Stephen Pollard 




A Mathematical 
Prelude to 

the Philosophy 

of Mathematics 


Springer




Stephen Pollard 

Department of Philosophy and Religion 
Truman State University 

Kirksville, MO 

USA 


ISBN 978-3-319-05815-3    ISBN 978-3-319-05816-0 (eBook)
DOI 10.1007/978-3-319-05816-0 
Springer Cham Heidelberg New York Dordrecht London 


Library of Congress Control Number: 2014936236 


© Springer International Publishing Switzerland 2014 

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of 
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, 
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or 
information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar 
methodology now known or hereafter developed. Exempted from this legal reservation are brief 
excerpts in connection with reviews or scholarly analysis or material supplied specifically for the 
purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the 
work. Duplication of this publication or parts thereof is permitted only under the provisions of 
the Copyright Law of the Publisher’s location, in its current version, and permission for use must 
always be obtained from Springer. Permissions for use may be obtained through RightsLink at the 
Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law. 
The use of general descriptive names, registered names, trademarks, service marks, etc. in this 
publication does not imply, even in the absence of a specific statement, that such names are exempt 
from the relevant protective laws and regulations and therefore free for general use. 

While the advice and information in this book are believed to be true and accurate at the date of 
publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for 
any errors or omissions that may be made. The publisher makes no warranty, express or implied, with 
respect to the material contained herein. 


Printed on acid-free paper 


Springer is part of Springer Science+Business Media (www.springer.com) 


In memory of Dean Randall Pollard 


φιλοσοφοῦμεν ἄνευ μαλακίας


Preface 


I have a principled argument for why this book should exist. I have no such 
argument for why it contains just what it contains. The principles are these: 


• You cannot understand philosophy of mathematics without understanding
  mathematics.
• You cannot understand mathematics without doing mathematics.


The main point of this book, with its 298 exercises, is to give students 
opportunities to recreate some mathematics that will illuminate important readings 
in philosophy of mathematics. As for the particular mathematical materials I have 
chosen: they are the unforced fruits of lengthy experience. I have taught under- 
graduates for three decades. In the 14 times I taught philosophy of mathematics, 
I discovered again and again that some important text was opaque to my students 
largely because they lacked some particular bit of mathematical background. Most 
of the missing bits came from a handful of subject areas: Primitive Recursive 
Arithmetic, Peano Arithmetic, Gödel’s theorems (completeness, compactness, and
incompleteness), interpretability, the hierarchy of sets (especially the landscape of
V(ω+ω)), Frege Arithmetic, and intuitionist sentential logic. This book offers
exercises in these areas supported by explanatory materials and just a dash of 
philosophy. I have made no attempt to impose a grand unifying narrative. There is 
no central thesis. Some students of philosophy of mathematics have found it 
helpful to work their way into these subject areas. Other readers may enjoy the 
same experience. 

I cannot think of any undergraduate course in which this book would be an 
appropriate stand-alone text. I offer it as a supplement to primary texts chosen by 
instructors or automaths. For the benefit of the latter, the book offers some guid- 
ance about what those primary texts might be. Professors will, of course, make 
their own choices, both about primary texts and about assignments within this 
book. I would be amazed if any instructor assigned every section and every 
exercise. I do not see how there could be time for it. I would expect instructors to 
let their own interests guide them as they pick and choose. Not every choice will 
make sense. Chapter 2, for example, would be inscrutable without Chap. 1. Parts 
of Chap. 6 presuppose parts of Chap. 2. Chapters 4 and 5 build on Chap. 3. On the 
other hand, a leap-frog journey consisting of Chaps. 1, 3, and 7 would make 
sense, as would many other paths over and around sections instructors might
choose to skip. That is one respect in which the book’s somewhat motley character 
is a virtue. 

The readers most likely to benefit from this book are exactly those most likely 
to benefit from a philosophy of mathematics course. Those readers will have some 
background in formal logic, they will find mathematics engaging and non- 
threatening, they will understand basic properties of the natural and real numbers, 
and they will see the point of asking “why” and “how” questions that emerge 
from mathematical experience, but cannot be answered by just producing more 
mathematics. I would describe this last trait as “philosophical inclination,” and it
is the inclination, rather than any particular philosophical training, that is likely to 
be most important here. (Once, in a philosophy of social science course, a student 
of mine interrupted a discussion of deductive-nomological explanation by asking, 
“Why would anyone want to understand what explanation is?” There, I suspect, 
was a student who had never felt a philosophical impulse in his life. I would not 
recommend this book to him.) 

I shared the joys and frustrations of a work-in-progress with my students. Some 
responded by dropping my course. Others stuck it out and helped make the book 
better. I like to think that, in return, they learned what it is like to do mathematics 
and philosophy. In any case, they have my thanks. Thanks, too, to Florence Emily 
Pollard for her careful reading of the penultimate draft. 


Kirksville, MO, USA, January 2014 Stephen Pollard 
Chapter 1
Recursion, Induction 


1.1 Numerals, Types, Tokens 


The mathematician DAVID HILBERT (1862–1943) had the odd but, as it turned out,
fertile idea that we can generate all sorts of interesting mathematics if we reflect 
on the language of mathematics—if we talk about mathematical talk. This is the 
magic of mathematics: start with a dry, unpromising little seed; nourish it with logic, 
imagination, and obsessive energy; if the mathematics gods smile, you find yourself 
wandering in a whole forest of beautiful ideas. Hilbert’s little seed was a mathematical 
theory about the languages in which we express mathematical theories. 

When we talk about what Hilbert did, we will be talking about talk about math- 
ematical talk: we will be discussing a theory about the way mathematicians express 
themselves. We will do some of that in this chapter, talking about what Hilbert did. 
But, mainly, we will do, or re-do, what Hilbert did: we will experience a bit of 
Hilbert’s project from the inside. This may require some patience on your part. We 
will be investigating an especially primitive bit of mathematical language in an espe- 
cially meticulous way, working hard to prove elementary claims that you might have 
been perfectly happy to assume. The pay-off will be an insider’s view of one of the 
most fundamental, most secure, and most deeply understood areas of mathematical 
activity: an area known as PRIMITIVE RECURSIVE ARITHMETIC or PRA. We begin 
our development of PRA by considering an archaic way of naming positive integers. 

One way to answer a “How many?” question is to provide an example of how 
many. When the nice lady asks the little boy how old he is, the child holds up two 
fingers to show he is that many years old. When asked how old I am, I could, at the 
risk of appearing quite mad, display a piece of paper marked up like this: 


I could say, “Look: one mark for each year!” On each birthday, I could add a new 
tally mark. A year from now, my piece of paper would be marked up like this: 
I would then be employing a very simple system of NUMERATION, a very simple
system for representing NUMBERS with NUMERALS. 

Numerals are names of numbers. Numbers are the things named by numerals. The 
Roman numeral ‘X’ and the binary numeral ‘1010’ are different numerals, but they 
name the same number. In my simple system, the numeral ‘|’ names the number one, 
the numeral ‘| |’ names the number two, and so on. Each of my numerals exemplifies
the number it names: a numeral consisting of n tally marks names the number n. 

We now consider a distinction that is old hat to philosophers, but may not be so 
familiar to mathematicians. Suppose I hold up a piece of paper with pencil markings 
like this

| | | |

and you hold up a piece of paper with pencil markings like this

| | | |


Then we are each displaying the same numeral: the one numeral in my simple sys- 
tem that names the number four; the one numeral consisting of four tally marks. 
There are two pieces of paper each bearing its own graphite inscriptions; but each 
graphite inscription is a particular instance of one and the same numeral. Now sup- 
pose you burn your piece of paper. Have you destroyed the numeral ‘| | | |’? I would 
be inclined to say that you have destroyed a numeral without destroying the numeral. 
Philosophers have some terminology that is useful here. They distinguish between 
the numeral-TYPE that survives the incineration of your paper and the numeral-TOKEN 
that goes up in flames. The type is one thing that has many instances or tokens. You 
can write one numeral a hundred times on a chalkboard. A hundred instances of the 
numeral will then appear on the board. Each of those instances is a token of the one 
type. You can destroy the hundred tokens by erasing the board. It is not so clear that 
you can do anything to destroy the one type. 

For the rest of this chapter, when I refer to “numeral-tokens” and “numeral-types” 
I will mean tokens and types of numerals in our simple system of numeration. We now 
consider some properties of these tokens and types. Note, first, that each numeral- 
token is an instance of some numeral-type. To be a token is to be an instance of a 
type. To be a token is to “instantiate” a type. The very meaning of the word ‘token’ 
guarantees that there are no typeless tokens. The meaning does not guarantee that 
there are tokens or types. That would be an astounding thing for a meaning to do. 
The meaning only guarantees that if there were a token, there would be a type it 
instantiates. To take a more everyday example, the meanings of the terms ‘wife’ and 
‘spouse’ do not guarantee that there are wives or spouses. They do guarantee that if 
there were a wife, she would have a spouse. 

Each numeral-token is an instance of at least one numeral-type. Indeed, each 
numeral-token is an instance of exactly one numeral-type. The type of a numeral- 
token is determined by how many tally marks it includes. The graphite inscriptions
on our two pieces of paper are tokens of the type: “numeral consisting of four tally 
marks.” Neither inscription is a numeral consisting of three tally marks (though each 
has parts consisting of three tally marks). Neither inscription is a numeral consisting 
of five tally marks. In fact, each is an instance only of the type indicated. Since each 
numeral-token has a unique numeral-type, we can introduce some useful terminology. 


Definition 1.1 If x is a numeral-token, we let τ(x) be x’s numeral-type.


We can use our new symbol ‘τ’ (the Greek letter tau) to state an important
proposition.


Proposition 1.1 If x and y are numeral-tokens, then τ(x) = τ(y) if and only if
there are exactly as many tally marks in x as in y.


Remember this proposition. It will very soon prove useful. It is known, in the 
technical language of philosophy, as an IDENTITY CRITERION for numeral-types. It 
supplies necessary and sufficient conditions for numeral-types (such as τ(x) and
τ(y)) to be identical to one another (τ(x) = τ(y)). At least, it does so for
numeral-types that have tokens. We know that x is a token of τ(x). A numeral-token is,
necessarily, a token of its own numeral-type. We are leaving open the possibility, 
however, that there are types with no tokens. There are no typeless tokens, but we 
are open to the idea that there are tokenless types. Our identity criterion would not 
apply to such types. 
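
The identity criterion lends itself to a small experiment. The following Python sketch is an illustration only and not part of the formal development: the class `Token`, its fields, and the function name `tau` are assumptions chosen for the example. It models a numeral-token as a distinct object and its numeral-type as a string of tally marks, so that two tokens share a type exactly when they contain equally many marks.

```python
class Token:
    """Hypothetical stand-in for one physical inscription of tally marks."""

    def __init__(self, marks: int, where: str):
        if marks < 1:
            raise ValueError("a numeral-token contains at least one tally mark")
        self.marks = marks  # how many tally marks the inscription contains
        self.where = where  # illustrative label for where the inscription sits


def tau(x: Token) -> str:
    """Model of τ(x): the numeral-type of token x, as a string of tally marks."""
    return "|" * x.marks


mine = Token(4, "my piece of paper")
yours = Token(4, "your piece of paper")

print(mine is yours)            # False: two inscriptions, two tokens
print(tau(mine) == tau(yours))  # True: one and the same numeral-type
```

The model settles by fiat what the text leaves open: it says nothing about when two inscriptions count as the same token, and it cannot represent tokenless types except as strings nobody has inscribed.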

If we tried to formulate an identity criterion for numeral-tokens, we would face 
even thornier problems. Imagine two slips of paper, each bearing one tally mark. 
Bring the slips together to form a numeral-token y consisting of two tally marks. Do 
we have a clear idea of what changes to the slips would yield something other than 
y? What if we moved the slips one micron further apart? Would we still have y? If so, 
how many microns would we have to move the slips to get something other than y? 
What if we turned one of the slips upside down or applied a microscopic amount of 
additional graphite to it or moved the tally marks a tiny bit out of alignment? Would 
we still have y? We would have to think seriously about such questions before we 
could formulate an explicit identity criterion for numeral-tokens: a daunting task 
that I, frankly, am going to dodge. That does not leave us helpless. In this, as in 
so many areas, our capacity to make reasonable judgments exceeds our capacity
to write down rules for making those judgments. We can make sensible decisions 
about the identity or distinctness of numeral tokens without formulating necessary 
and sufficient conditions for their identity or distinctness. It is worth remarking, 
though, that the mathematics we will develop in this chapter is emerging from notions 
infected with much more vagueness than we would tolerate in the mathematics 
itself. 
1.2 Immediate Succession


We can construct longer and longer numeral-tokens by adding one tally mark at a 
time: 


If we add one tally mark to a numeral-token x, the only result of this construction 
that interests us, what we will call THE RESULT, is the new numeral-token consisting 
of all the tally marks of x together with the tally mark we added. (If any other tally 
marks have appeared or any of x’s original tally marks have disappeared, then we 
have not successfully performed the intended operation.) We say that this result is 
what the construction “add one tally mark” YIELDS. If we add one tally mark at the 
end of a numeral-token x consisting of six tally marks, then one of the new objects 
we create is the inscription consisting of the last four tally marks in x together with 
the tally mark we added. 




Even if we regard this inscription as a numeral-token (rather than just a part of a 
numeral-token) and concede that it is one result of our construction, it is not what 
we call the result. Our construction does not yield this numeral-token in our special 
sense of “yield.” 

In the next definition, we perform the everyday mathematical trick of compressing 
a lot of words into a little symbol: in this case, ‘υ’ (the Greek letter upsilon).


Definition 1.2 υ(xy) if and only if y is the result of adding one tally mark to
numeral-token x.


Suppose y is the result of adding one tally mark to x while z is the result of adding 
one tally mark to w. That is, υ(xy) and υ(wz). Suppose, further, that τ(x) = τ(w).
Then, according to our identity criterion for numeral-types, x and w feature the same
number of tally marks. If we add one tally mark to each, we will still have the same
number of tally marks. So, applying our identity criterion again, τ(y) = τ(z). We
have confirmed the following entailment. 


τ(x) = τ(w) ⇒ τ(y) = τ(z).


Suppose, conversely, that τ(y) = τ(z). This means that when we add a tally mark to
x and a tally mark to w we get tokens with equal numbers of tally marks. So x and
w must have equal numbers of tally marks and, hence, τ(x) = τ(w). This confirms
the converse of our earlier entailment. 


τ(y) = τ(z) ⇒ τ(x) = τ(w).


All of this establishes the following proposition. 
Proposition 1.2 If υ(xy) and υ(wz), then


τ(x) = τ(w) ⇔ τ(y) = τ(z).


Suppose x consists of two tally marks while y consists of three. There is no 
guarantee that υ(xy). Numeral-tokens can differ by one tally mark without either
being a lengthening of the other. x might have been written on a chalkboard in Tuvalu 
fifty years ago and immediately erased, while y first appeared yesterday on a piece 
of paper in Missouri. Still, somewhere in the vast expanse of space-time, we would 
expect to find numeral-tokens w and z with the following properties. 


τ(w) = τ(x), τ(z) = τ(y), υ(wz).


We would expect to find a token consisting of three marks that is a lengthening of a 
token consisting of two marks. For example, z might be y itself and w might consist 
of the first two tally marks in y. These reflections inspire the following definition. 


Definition 1.3 σ(xy) if and only if a numeral-token of the same type as y is the
result of adding one tally mark to a numeral-token of the same type as x.


The idea is that two numeral-tokens will stand in this relation σ (“sigma”) if and
only if the second token features exactly one more tally mark than the first (whether
or not the second is the result of adding a tally mark to the first). We will now confirm
some important facts about σ. Suppose σ(xy). Then we can pick tokens x′, y′ with
the properties we discussed above:


τ(x′) = τ(x), τ(y′) = τ(y), υ(x′y′).

Now suppose σ(xz). Pick x″, z′ such that

τ(x″) = τ(x), τ(z′) = τ(z), υ(x″z′).


Then τ(x′) = τ(x″) and, hence, by Proposition 1.2, τ(y′) = τ(z′). So τ(y) = τ(z)
and, on the assumption that σ(xy), we have confirmed the following entailment.


σ(xz) ⇒ τ(y) = τ(z).

Suppose, conversely, that τ(y) = τ(z), still assuming that σ(xy). Then τ(y′) = τ(z)
and, hence, a numeral-token of the same type as z (namely y′) is the result of adding
one tally mark to a numeral-token of the same type as x (namely x′). So σ(xz) and
we have confirmed the converse of the earlier entailment.


τ(y) = τ(z) ⇒ σ(xz).


Our reward is a new proposition. 
Proposition 1.3 If σ(xy), then

σ(xz) ⇔ τ(y) = τ(z).


You can get warmed up by proving a closely related proposition. 


Exercise 1.1 Prove: if σ(xy), then

σ(zy) ⇒ τ(x) = τ(z).


σ is a relation between numeral-tokens. There is a corresponding relation between
numeral-types: the relation of IMMEDIATE SUCCESSION. 


‘| |’ immediately succeeds ‘|’
‘| | |’ immediately succeeds ‘| |’
‘| | | |’ immediately succeeds ‘| | |’
‘| | | | |’ immediately succeeds ‘| | | |’


And so on. This relation satisfies the following proposition. 
Proposition 1.4 τ(y) immediately succeeds τ(x) if and only if σ(xy).
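
To see how the token-level relations and the type-level relation fit together, here is a minimal Python sketch. It is an illustration under stated assumptions, not the book's own formalism: the class `Token`, the field `made_from`, and the function names are hypothetical, and `sigma` simply counts marks, which presupposes that suitable witness tokens exist somewhere.

```python
class Token:
    """Hypothetical model of an inscription; distinct objects are distinct tokens."""

    def __init__(self, marks: int, made_from: "Token | None" = None):
        self.marks = marks          # number of tally marks in the inscription
        self.made_from = made_from  # the token this one lengthens, if any


def add_mark(x: Token) -> Token:
    """The construction 'add one tally mark'; returns what the text calls THE RESULT."""
    return Token(x.marks + 1, made_from=x)


def tau(x: Token) -> str:
    """Model of τ(x): the numeral-type of x as a string of tally marks."""
    return "|" * x.marks


def upsilon(x: Token, y: Token) -> bool:
    """Model of υ(xy): y is the result of adding one tally mark to x itself."""
    return y.made_from is x


def sigma(x: Token, y: Token) -> bool:
    """Model of σ(xy), decided by counting marks (assumes witness tokens exist)."""
    return y.marks == x.marks + 1


def immediately_succeeds(b: str, a: str) -> bool:
    """Type-level relation: type b has exactly one more tally mark than type a."""
    return b == a + "|"


x = Token(2)     # say, written on a chalkboard long ago and erased
y = Token(3)     # say, written yesterday on a piece of paper
w = Token(2)
z = add_mark(w)  # z really is a lengthening of w

print(upsilon(x, y))  # False: y was not built from x
print(upsilon(w, z))  # True
print(sigma(x, y))    # True: y has exactly one more mark than x
print(immediately_succeeds(tau(y), tau(x)) == sigma(x, y))  # True, as in Proposition 1.4
```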


Note that we are using a relation between tokens (the relation σ) to characterize
a relation between types (the relation of immediate succession). We will now
note some important facts about immediate succession. Suppose τ(y) and τ(z) both
immediately succeed τ(x). Then, by Proposition 1.4, σ(xy) and σ(xz) and, hence,
by Proposition 1.3, τ(y) = τ(z). Say that a type is instantiated if there are tokens
of that type. We have shown that an instantiated numeral-type will have at most 
one instantiated immediate successor. We are going to suppose that numeral-types 
behave like this whether or not they are instantiated. 


Proposition 1.5 Each numeral-type has at most one immediate successor. 
Now you need to do a little work to prepare the way for our next proposition. 


Exercise 1.2. Show that an instantiated numeral-type will immediately succeed at 
most one instantiated numeral-type. 


We are going to suppose that numeral-types behave as in Exercise 1.2 whether or 
not they are instantiated. 


Proposition 1.6 Each numeral-type immediately succeeds at most one numeral-type 
(so numeral-types that share an immediate successor are the same). 


In our system of numeration, at least one tally mark appears in each numeral-token. 
So you cannot add a tally mark to one of our numeral-tokens and end up with a single 
tally mark all by itself. Suppose τ(y) = ‘|’. And suppose ‘|’ immediately succeeds
τ(x). Then, by Proposition 1.4, σ(xy) and, hence, by Definition 1.3, a numeral-token
of the same type as y is the result of adding one tally mark to a numeral-token of the
same type as x. But then, by Proposition 1.1, a numeral-token consisting of a single
tally mark is the result of adding one tally mark to a numeral-token of type τ(x).
Since this is impossible, we conclude that ‘|’ does not immediately succeed any 
instantiated numeral-type. We assume, as usual, that uninstantiated numeral-types 
will behave the same way. 


Proposition 1.7 ‘|’ does not immediately succeed any numeral-type. 


1.3 How Many Types? 


Each numeral-type has at most one immediate successor. Does each numeral-type 
have at least one immediate successor? I do not know. I would never claim for such 
a proposition the sort of certainty we normally associate with mathematical results. I 
am pretty sure, though, that the proposition is not fundamentally incoherent: it seems 
logically possible. In a logically impossible scenario, anything goes: in classical logic,
everything follows from a contradiction. We, however, are going to suppose that we 
can distinguish between what would and what would not be the case if every numeral- 
type had an immediate successor. So it should be worthwhile to explore what things 
would be like if this were really so. Furthermore, it seems conceptually possible that 
every numeral-type has an immediate successor. That is, this would be compatible 
with our concept of a numeral-type. So, when we explore what things would be like 
under these circumstances, we continue to investigate numeral-types, not just things 
we call “numeral-types” without any definite idea of what, if anything, we are really 
discussing. 

If b is an immediate successor of a, Proposition 1.5 allows us to describe it as the
(one and only) immediate successor of a. This justifies some new terminology: 

S(|) = | |
S(| |) = | | |
S(| | |) = | | | |


and so on. S is the operation or function that takes us from a numeral-type to its 
immediate successor. In our new notation, Proposition 1.6 looks like this

S(a) = S(b) ⇒ a = b

while Proposition 1.7 looks like this

| ≠ S(a)

with ‘a’ and ‘b’ understood to be variables that range over all numeral-types (that
is, symbols that allow us to make claims about all numeral-types). These two
propositions let us prove that there are infinitely many numeral-types (at least in the 
possible world we are exploring). To get an idea of why this is so, suppose there are
only three numeral-types: |, S(|), and S(S(|)). Each numeral-type has an immediate
successor. What, then, is the immediate successor of S(S(|))? By Proposition 1.7,
| ≠ S(S(S(|))). On the other hand, if S(|) = S(S(S(|))), then Proposition 1.6 implies
that | = S(S(|)), contrary to Proposition 1.7. Finally, if S(S(|)) = S(S(S(|))), then
two applications of Proposition 1.6 imply that | = S(|), contrary to Proposition 1.7. 
So |, S(|), and S(S(|)) cannot really be the only numeral-types because that would 
leave S(S(|)) without an immediate successor. 
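
The argument can be spot-checked mechanically. The Python sketch below is a finite experiment, not a proof: it assumes numeral-types are modeled as strings of tally marks with S appending one mark (the names `ONE`, `S`, and `first_types` are choices made for the example), and it confirms for the first fifty types that the analogues of Propositions 1.6 and 1.7 hold and that repeated succession never loops back.

```python
ONE = "|"  # the numeral-type consisting of a single tally mark


def S(a: str) -> str:
    """Model of the successor operation: append one tally mark."""
    return a + "|"


def first_types(n: int) -> list[str]:
    """Return |, S(|), S(S(|)), ... up to n numeral-types."""
    types, a = [], ONE
    for _ in range(n):
        types.append(a)
        a = S(a)
    return types


types = first_types(50)

# Analogue of Proposition 1.7: | is not the successor of any type.
no_type_precedes_one = all(S(a) != ONE for a in types)

# Analogue of Proposition 1.6: types with the same successor are the same.
successor_is_injective = all(
    a == b for a in types for b in types if S(a) == S(b)
)

# Consequence: the chain of successors never revisits an earlier type.
no_repeats = len(set(types)) == len(types)

print(no_type_precedes_one, successor_is_injective, no_repeats)  # True True True
```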

I should point out that I am being a little sloppy with my notation. We usually use 
quotation marks when we name names. | | is the number whose name in our system 
of numeration is ‘| |’. When I write 


| | | = S(| |)


this may appear to say that the number | | | is the result of applying S to the number 
||. The equation 


‘| | |’ = S(‘| |’)


more clearly states that the numeral-type ‘|||’ is the result of applying S to the 
numeral-type ‘| |’. A disadvantage of this more careful approach is that writing down 
all the quotation marks becomes prohibitively tedious after the first few dozen pairs. 
In self-defence, we adopt the convention that tokens of, say, ‘| |’ can name ‘| |’ itself. 
That is, two tally marks without quotation marks can instantiate the numeral-type 
‘| |’ and name it.

If tokens of type a name object x, it seems reasonable to say that type a is itself a 
name of x: types name what their tokens name. So, if tokens of a numeral-type are to 
name that very numeral-type, the numeral-type will name itself. Is this a problem?
It will turn out that we are exploring a world in which our numeral-types have all 
the mathematically important properties of the numbers they are supposed to name. 
So it should not create mathematical difficulties if we let our numeral-types name 
themselves: they will still be naming things that behave like numbers. This may 
or may not render our numeral-types ambiguous. Each numeral-type will, indeed, 
name both a numeral-type (itself) and a number. Furthermore, at the beginning of this 
chapter, I emphasized the distinction between a name and what the name names. In 
general, this is an important thing to keep straight. But you might consider whether, in 
this special case, the distinction is unwarranted. When imagining a situation in which 
numeral-types have all the mathematically important properties we normally attribute 
to numbers, it may be convenient to identify each numeral-type with the number it 
names. There will then be no ambiguity: each numeral-type will name exactly one 
thing. Furthermore, there may be a philosophical benefit. If you understand what 
numeral-types are, you will know what numbers are. What are numbers? They are 
numeral-types. Conversely, if you understand what numbers are, you will know 
what numeral-types are. What are numeral-types? They are numbers. On the other 
hand, if you already understand what both numeral-types and numbers are and you
understand them to be different things, you may not be so happy with my proposal 
to identify them. I will not insist upon such an identification. 


1.4 Recursive Definition 


In number theory, we can define primality (the property of being prime) by stating 
necessary and sufficient conditions for a number to be prime. 


A natural number is prime if and only if it is greater than one and its only divisors are one 
and itself. 


We could also define primality by describing a mechanical procedure for determining 
whether a number is prime. Computer science majors could accomplish this by 
writing a program. Our inspiration in this section will be definitions of this latter 
sort. We are going to define operations on numeral-types by offering procedures 
rather than verbal equivalents. Our first example is a definition of addition. 


Definition 1.4 
a + | = S(a)


a + S(b) = S(a + b).


To get some practice with Definition 1.4, we are going to use it to figure out what 
| | + | | | is. Our approach will not be very subtle: we will figure out what | | + |
is; we will use that information to figure out what | | + | | is; then we will use that
information to figure out what | | + | | | is. This is the sort of thing a machine might
do, relentlessly advancing one little step at a time with no great intuitive leaps. As for
the nuts-and-bolts of applying Definition 1.4, we are, once again, using the variables
‘a’ and ‘b’ to make claims about all numeral-types: our definition says that certain
relationships will hold no matter what a and b are. So we should feel free to replace
any occurrences of ‘a’ or ‘b’ with any strings of tally marks we wish. Replacing ‘a’ 
with ‘| |’ in the first clause of the definition, we get: 


| | + | = S(| |) = | | |.


So now we know what | | + | is. Applying this result and the second clause of the 
definition, we determine that 


| | + | | = | | + S(|) = S(| | + |) = S(| | |) = | | | |.

So now we know what | | + | | is. Repeating this procedure, we determine that 


| | + | | | = | | + S(| |) = S(| | + | |) = S(| | | |) = | | | | |.

We conclude that | | + | | | = | | | | |. More generally, we have a mechanical procedure
for determining what a + b is as long as a and b are presented to us in a way that allows
us to determine how many tally marks their tokens would have. (If we only know
that a and b are God’s two favorite numeral-types, we are in trouble.) Just a bit ago,
we discussed one way of so presenting them: using their own tokens to name them;
for example, writing

| | |

to refer to the type “numeral consisting of three tally marks.”
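
The mechanical procedure just described can be written out as a short program. The following Python sketch is one possible rendering, assuming numeral-types are modeled as strings of tally marks; the names `ONE`, `S`, and `add` are choices made for the example. The two branches of `add` mirror the two clauses of Definition 1.4.

```python
ONE = "|"  # the numeral-type consisting of a single tally mark


def S(a: str) -> str:
    """Model of the successor operation: append one tally mark."""
    return a + "|"


def add(a: str, b: str) -> str:
    """Addition following the two clauses of Definition 1.4."""
    if b == ONE:
        return S(a)             # a + | = S(a)
    earlier = b[:-1]            # b = S(earlier), so strip one mark to recover earlier
    return S(add(a, earlier))   # a + S(earlier) = S(a + earlier)


print(add("||", "|"))    # |||
print(add("||", "||"))   # ||||
print(add("||", "|||"))  # |||||
```

Each recursive call shortens the second argument by one tally mark, so the computation retraces the steps of the worked example in reverse order before assembling the answer.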

Definition 1.4 is an example of a RECURSIVE DEFINITION. A recursive definition 
tells us what value an operation yields when we feed it ‘|’ and then tells us how 
to calculate the output at any successor stage: a calculation in which we apply an 
operation to the output at the prior stage (Definition 1.4), to the input at the prior stage 
(Definition 1.5), or to both (Exercise 1.6). Among the operations we are allowed to 
apply are the successor operation S, the “null operation” that assigns a to each input 
a, and any operations we have already recursively defined. In Definition 1.4, we 
apply S to the output at the prior stage. In Definition 1.5, we apply the null operation 
to the input at the prior stage. A recursive definition allows us to calculate values for 
arbitrary inputs by starting from ‘|’ and applying a fixed procedure over and over 
(that is, recursively). 
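
The general pattern can itself be sketched as a program. The helper below is a hypothetical illustration, assuming numeral-types are modeled as strings of tally marks; `define_recursively`, `base`, and `step` are names invented for the example. It builds an operation from a value at ‘|’ and a rule that computes the output at each successor stage from the input and output at the prior stage.

```python
from typing import Callable

ONE = "|"  # the numeral-type consisting of a single tally mark


def S(a: str) -> str:
    """Model of the successor operation: append one tally mark."""
    return a + "|"


def define_recursively(
    base: str, step: Callable[[str, str], str]
) -> Callable[[str], str]:
    """Return f with f(|) = base and f(S(b)) = step(b, f(b))."""

    def f(b: str) -> str:
        if set(b) != {"|"}:
            raise ValueError("expected a non-empty string of tally marks")
        out, cur = base, ONE
        while cur != b:            # climb from | to b one successor step at a time
            out = step(cur, out)   # output at S(cur) from input cur and output at cur
            cur = S(cur)
        return out

    return f


# Addition of || on the left: apply S to the output at the prior stage.
add_two = define_recursively(S("||"), lambda prior_input, prior_output: S(prior_output))

# An operation that returns the input at the prior stage unchanged.
back_one = define_recursively(ONE, lambda prior_input, prior_output: prior_input)

print(add_two("|||"))   # |||||
print(back_one("|||"))  # ||
```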


Definition 1.5 
pred(|) = |


pred(S(a)) = a.


‘pred’ stands for “predecessor.” We could think of it as the instruction “erase one 
tally mark,” with the slight glitch that, if you start with just one tally mark, you leave 
it be. (We are supposing there is no numeral-type whose tokens consist of no tally 
marks.) So 

pred(| |) = pred(S(|)) = | = pred(|).


That is, pred treats | as the “predecessor” of both | and ||. Otherwise, pred is 
well-behaved. 
pred(| | |) = | |


pred(| | | |) = | | |


and so on. 
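
As a quick check on Definition 1.5, here is a minimal Python sketch, again assuming numeral-types are modeled as strings of tally marks (the names `ONE` and `pred` as Python identifiers are choices made for the example).

```python
ONE = "|"  # the numeral-type consisting of a single tally mark


def pred(a: str) -> str:
    """Predecessor following the two clauses of Definition 1.5."""
    if a == ONE:
        return ONE   # pred(|) = |
    return a[:-1]    # pred(S(a)) = a: erase one tally mark


print(pred("|"))     # |
print(pred("||"))    # |
print(pred("|||"))   # ||
print(pred("||||"))  # |||
```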


Exercise 1.3. Letting the variable ‘a’ range over people, define the function child 
as follows. 
child(the father of a) = a

Give an example of an absurd conclusion that follows from this definition. What is
there about S that makes our definition of pred legitimate while the definition of
child is illegitimate?


Definition 1.6 
a − | = pred(a)


a − S(b) = pred(a − b).
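
Definition 1.6 can be run in the same way as Definition 1.4. The sketch below is illustrative only: it assumes numeral-types are modeled as strings of tally marks, and the function name `subtract` is a label chosen for the example. The sample input is deliberately different from the ones in the exercises that follow.

```python
ONE = "|"  # the numeral-type consisting of a single tally mark


def pred(a: str) -> str:
    """Predecessor as in Definition 1.5: pred(|) = |, pred(S(a)) = a."""
    return ONE if a == ONE else a[:-1]


def subtract(a: str, b: str) -> str:
    """Subtraction following the two clauses of Definition 1.6."""
    if b == ONE:
        return pred(a)                   # a − | = pred(a)
    earlier = b[:-1]                     # b = S(earlier)
    return pred(subtract(a, earlier))    # a − S(earlier) = pred(a − earlier)


print(subtract("||||", "||"))  # ||
```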
Exercise 1.4 Use Definitions 1.5 and 1.6 to calculate | | | | | − | | |.


Exercise 1.5 Use Definitions 1.5 and 1.6 to calculate | − | | |.


Exercise 1.6 Recursively define the function f(a) = | + | | + | | | + ··· + a (letting
f(|) = |).


1.5 Proof by Induction 


Although nothing I have said so far rules out numeral-types “infinitely distant” from 
‘|’, I really do mean to rule them out. The idea is that, if a is a numeral-type, then


a = S(S(S(...S(|)...)))


where the ellipses ‘...’ represent finitely many applications of the successor opera- 
tion. Mathematical induction is an attempt to capture this idea. It is also a powerful 
method of proof. 

If ‘|’ has a disease and each numeral-type infects its immediate successor, how
many numeral-types will catch the disease? ‘||||’ will catch it from ‘|||’ who caught
it from ‘||’ who caught it from ‘|’.

‘||||||’ will catch it from ‘|||||’ who caught it from ‘||||’.


In fact, every numeral-type will catch the disease because every numeral-type is only 
finitely many successor steps away from ‘|’ and, so, stands at the end of one of these 
chains of infection. 

This gives us a way to show that every numeral-type has some property. First, 
show that ‘|’ has the property. Then show that the property is “infectious”: that S(a) 
has it whenever a does. In more standard terminology, the goal is to show that the 
property is HEREDITARY. If ‘|’ has a hereditary property, it will follow that every 
numeral-type inherits the property. Here are some examples. 

Theorem 1.1 | + b = S(b).

Proof We first confirm that ‘|’ has the desired property. This would mean that

| + | = S(|)

which is an immediate consequence of Definition 1.4. We next assume the INDUCTIVE
HYPOTHESIS that an arbitrary numeral-type has the desired property. Actually, no
numeral-type is more or less arbitrary than any other. It is we who are to treat the
numeral-type as arbitrary by not using any special information about it: by only
making inferences that would apply to any other numeral-type. To guarantee that
we do not cheat, we will give our “arbitrary numeral-type” a silly name that reveals
nothing about it: say, ‘☺’. Our inductive hypothesis is that

| + ☺ = S(☺).

We now try to confirm that ☺ passes this property on to S(☺). We want to show that
the preceding equation still holds when we replace each ‘☺’ with ‘S(☺)’. When we
perform this substitution on the left side of our inductive hypothesis, we obtain

| + S(☺).

When we perform this substitution on the right side of our inductive hypothesis, we
obtain

S(S(☺)).

So our goal is to verify that

| + S(☺) = S(S(☺)).

We now find equations that take us from the left side of this equation to the right
side. By Definition 1.4,

| + S(☺) = S(| + ☺).

Applying S to both sides of our inductive hypothesis, we obtain

S(| + ☺) = S(S(☺)).

So, as desired,

| + S(☺) = S(S(☺)).

That is, S(☺) inherits the desired property from ☺. Since ☺ could be any
numeral-type, we have shown that the property is hereditary. So, since ‘|’ has it, every
numeral-type has it.

Theorem 1.2 S(a) + b = S(a + b).

Proof Since we are faced with two variables, ‘a’ and ‘b’, there are a couple of ways
to approach this problem. We could try to show that the formula

S(__) + b = S(__ + b)

comes out true no matter how we fill in the blank. Or we could try to show that the
formula

S(a) + __ = S(a + __)

comes out true no matter how we fill in the blank. Let us try the second approach.
(If we get stuck, we can go back and try the first approach.) To begin, we want to
confirm that we get something true when we put ‘|’ in the blank. That just requires
two applications of Definition 1.4:

S(a) + | = S(S(a)) = S(a + |).

Now we deploy our inductive hypothesis:

S(a) + ☺ = S(a + ☺).

An inductive hypothesis is not always immediately useful. You may have to wander
around a bit before you see why it is helpful. Just don’t forget about it! Our goal is
to show that S(☺) behaves like ☺. That is, we want to confirm that

S(a) + S(☺) = S(a + S(☺)).

As in the preceding proof, we pick one side of this equation and try to transform
it into the other side. We will go from left to right, moving from S(a) + S(☺) to
S(a + S(☺)). Definition 1.4 assures us that

S(a) + S(☺) = S(S(a) + ☺).

So, by our inductive hypothesis,

S(a) + S(☺) = S(S(a + ☺)).

Another application of Definition 1.4 now yields

S(a) + S(☺) = S(a + S(☺))

just as we hoped.
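
A finite spot check is no substitute for the inductive proofs above, but it can catch a misread definition. The Python sketch below assumes numeral-types are modelled as strings of tally marks and that Definition 1.4 has the two clauses applied in the proofs (a + | = S(a) and a + S(b) = S(a + b)); the name add and the range of test values are choices of this sketch.

```python
# Sketch only: numeral-types are modelled as non-empty strings of '|'.
ONE = "|"

def S(a: str) -> str:
    return a + "|"

def pred(a: str) -> str:
    return ONE if a == ONE else a[:-1]

def add(a: str, b: str) -> str:
    # Assumed form of Definition 1.4: a + | = S(a); a + S(b) = S(a + b)
    if b == ONE:
        return S(a)
    return S(add(a, pred(b)))

numerals = ["|" * n for n in range(1, 9)]

# Theorem 1.1: | + b = S(b)
theorem_1_1 = all(add(ONE, b) == S(b) for b in numerals)
# Theorem 1.2: S(a) + b = S(a + b)
theorem_1_2 = all(add(S(a), b) == S(add(a, b)) for a in numerals for b in numerals)

print(theorem_1_1, theorem_1_2)
```

The check covers only the first eight numeral-types; it is the hereditary argument that extends the result to all of them.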

You might find Theorems 1.1 and 1.2 useful in some of the exercises below. In
Exercise 1.8, your inductive hypothesis could have the form:

(a + ☺) − ☺ = a.

Of course, you do not have to use the smiley face; but, whatever symbol you use, you
are to treat it as if it were a constant term: the name of a numeral-type. The term ‘a’,
on the other hand, is a variable that ranges over all numeral-types. Your inductive
hypothesis is a claim about all numeral-types a. So, whenever it seems useful, you
can feel free to replace ‘a’ with something else. For example:

(S(a) + ☺) − ☺ = S(a).


Exercise 1.7
a + b = b + a.


Exercise 1.8
(a + b) − b = a.


Exercise 1.9
a − (b + c) = (a − b) − c.


Exercise 1.10
| − b = || − b = |.


1.6 A Characteristic Function 


After two more exercises, we will consider a new function. When you do these and 
any subsequent exercises, feel free to use the results of previous exercises. 


Exercise 1.11
S(a) − a = |.


Exercise 1.12
S(a) − S(b) = a − b.

Now for the new function.


Definition 1.7
id(a, b) = |||| − ((S(a) − b) + (S(b) − a)).


Exercise 1.13
id(a, b) = id(b, a) = id(S(a), S(b)).




Since no sum of numeral-types can be | and, hence, (S(a) − b) + (S(b) − a) cannot
be |, id(a, b) has to be of the form

|||| − || or |||| − ||| or |||| − |||| or |||| − ||||| or ...

So id(a, b) has only two possible values: | and ||. We can express this as an equation:

pred(id(a, b)) = |.


(Note that pred yields | only when applied to | or ||. So this is an indirect way of
saying that id(a, b) is | or ||.) The next two exercises will let you verify that this
equation is correct.


Exercise 1.14
pred(a − b) = pred(a) − b.


Exercise 1.15
||| − (a + b) = |.


Here are two exercises that tell us something about when id yields each of its two 
possible values. 


Exercise 1.16
id(a, a) = ||.


Exercise 1.17
id(a, a + b) = |.


A CHARACTERISTIC FUNCTION answers a yes-or-no question about the values we
feed it. When we feed a pair of values to id, the question it answers is whether the
values are the same. || means “yes” and | means “no.” That is,

id(a, b) = || if a = b
           | otherwise.
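
The two-valued behavior of id can be tabulated directly. In the Python sketch below, numeral-types are modelled as positive integers (| as 1, || as 2, and so on), which is an assumption made for brevity; under that model the subtraction of Definition 1.6 amounts to max(a − b, 1), since pred is applied b times and never goes below |. The names monus and ident are choices of this sketch.

```python
# Sketch only: numeral-types are modelled as positive Python integers.

def monus(a: int, b: int) -> int:
    # Applying pred b times to a, where pred(1) = 1, never drops below 1.
    return max(a - b, 1)

def ident(a: int, b: int) -> int:
    """Definition 1.7: id(a, b) = |||| − ((S(a) − b) + (S(b) − a))."""
    return monus(4, monus(a + 1, b) + monus(b + 1, a))

values = {(a, b): ident(a, b) for a in range(1, 7) for b in range(1, 7)}
print(values[(3, 3)], values[(3, 5)], values[(5, 3)])   # 2 1 1
```

On this small range the function returns 2 exactly on the diagonal and 1 everywhere else, matching the "yes"/"no" reading.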

This may seem like a pretty boring function, but it is of interest to logicians because 
it allows us to define logical expressions such as ‘not’, ‘or’, ‘only if’, and more. This
shows that some parts of logic can be captured in elementary arithmetic. Although 
philosophers, logicians, and the occasional mathematician argue energetically about 
the proper analysis of various logical expressions, the behavior of the logical oper- 
ators definable in our theory of numeral-types is well understood and not a source 
of controversy. We will consider some of these operators in the next section. First, 
though, we prove an obscure looking fact that will turn out to be useful. 


Theorem 1.3 id(a, b) + id(id(a, b), |) = |||.


The next two exercises supply an inductive proof of this theorem.

Exercise 1.18
id(a, |) + id(id(a, |), |) = |||.


Exercise 1.19 As an inductive hypothesis, assume that

id(a, ☺) + id(id(a, ☺), |) = |||.

Your goal is to show that

id(a, S(☺)) + id(id(a, S(☺)), |) = |||.


You might do a second induction to confirm that this really does hold for every a.
You would then want to prove:


id(|, S(☺)) + id(id(|, S(☺)), |) = |||


and


id(S(Q), S(☺)) + id(id(S(Q), S(☺)), |) = |||.


(‘Q’ is another one of our names for an “arbitrary numeral-type.” As you reason
about Q, you should remember that you are allowed to replace ‘a’ in our earlier
inductive hypothesis with any terms you wish, including terms in which ‘Q’ appears.)


1.7 Some Logic 


Back in Sect. 1.3, I assumed that inequalities (such as ‘| ≠ S(a)’) were well-understood.
As natural as this assumption was, we can now see that it is unnecessary.
Each inequality


a ≠ b


is equivalent to an equation

id(a, b) = |.

We can always say that some numeral-types are different by asserting that some other
numeral-types are the same. So we could have treated ‘≠’ as a defined expression.
Let us make a fresh start and do just that.


Definition 1.8 a ≠ b ⟺ id(a, b) = |.


Exercise 1.20

| ≠ S(a).


When it seems convenient, we move the negation sign to the front:

¬(a = b) ⟺ a ≠ b.

This convention and Definition 1.8 yield the following equivalences.

id(id(a, b), |) = | ⟺ ¬(id(a, b) = |) ⟺ ¬(¬(a = b)).


The right-hand formula (the one with the two negation signs) is called a DOUBLE 
NEGATION. Assessing multiple negations of our equations requires no philosophical 
or linguistic analysis: it is simply a matter of calculation. The next exercise asks you 
to evaluate a triple negation. 


Exercise 1.21 Confirm that ¬(¬(¬(||| = ||))).


Given any two equations, we can use an inequality to assert that at least one of 
them is true (though it might not be immediately obvious that the following definition 
does the trick). 


Definition 1.9 a = b ∨ c = d ⟺ id(a, b) + id(c, d) ≠ ||.


∨ is the DISJUNCTION operator. You can read it as “or.” If you have studied logic, you
have probably seen the TRUTH TABLE for disjunction.


The idea is that a disjunction is false if and only if each of its components is false. If
I assert “φ or ψ,” I am asserting that at least one of the alternatives φ, ψ is true. So
I am wrong if and only if they are both false. Now consider the following table.


α = β   γ = δ   id(α, β)   id(γ, δ)   id(α, β) + id(γ, δ)   α = β ∨ γ = δ
  T       T        ||         ||             ||||                  T
  T       F        ||         |              |||                   T
  F       T        |          ||             |||                   T
  F       F        |          |              ||                    F


If I assert “α = β or γ = δ,” I am asserting that id(α, β) and id(γ, δ) are not both
| and, hence, that id(α, β) + id(γ, δ) is not ||. Definition 1.9 guarantees that the
disjunction of two equations is false if and only if both equations are false. In the 
next exercise, you will prove a disjunction central to classical logic. This might be a 
good time to recall Theorem 1.3. 
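
The claim that Definition 1.9 matches the truth table can be checked by brute force on a small range. The Python sketch below models numeral-types as positive integers, an assumption of the sketch, and compares the arithmetical definition with Python's own `or`; the helper names monus, ident, differs, and either_equal are hypothetical labels introduced here.

```python
# Sketch only: numeral-types are modelled as positive Python integers.

def monus(a: int, b: int) -> int:
    return max(a - b, 1)          # repeated pred never drops below 1

def ident(a: int, b: int) -> int:
    # Definition 1.7
    return monus(4, monus(a + 1, b) + monus(b + 1, a))

def differs(a: int, b: int) -> bool:
    """Definition 1.8: a ≠ b iff id(a, b) = |."""
    return ident(a, b) == 1

def either_equal(a: int, b: int, c: int, d: int) -> bool:
    """Definition 1.9: a = b ∨ c = d iff id(a, b) + id(c, d) ≠ ||."""
    return ident(a, b) + ident(c, d) != 2

rng = range(1, 5)
agrees = all(
    either_equal(a, b, c, d) == ((a == b) or (c == d))
    for a in rng for b in rng for c in rng for d in rng
)
print(agrees)
```

Agreement on a finite range is only a sanity check; the general claim rests on the fact that id takes just the two values | and ||.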


Exercise 1.22
a = b ∨ a ≠ b.

This is a version of the LAW OF EXCLUDED MIDDLE or LEM. Each instance
of LEM offers two alternatives: “Either φ or not φ.” What is excluded is some
alternative beyond these two. You might check to see whether we used LEM in any 
of our reasoning above. (If so, we might just be reasoning in a circle.) 


Exercise 1.23 Look up Grelling’s Paradox and be prepared to say why the following 
instance of LEM is questionable: “The adjective ‘heterological’ either is or is not 
heterological.” 


The following definition will help us introduce another logical operator. 


Definition 1.10

Σ_{i=|}^{|} f(i) = f(|)

Σ_{i=|}^{S(a)} f(i) = f(S(a)) + Σ_{i=|}^{a} f(i)


Σ is the SUMMATION function. That is,

Σ_{i=|}^{a} f(i) = f(|) + f(||) + ··· + f(a).


You can let f be any function you can define using the resources of this chapter.
For example, you could let f(i) be S(i) − i, as in Exercise 1.24, or id(i, |||), as in
Exercise 1.25.
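
The two clauses of Definition 1.10 translate almost word for word into a recursive function. This Python sketch models numeral-types as positive integers (an assumption of the sketch) and uses a sample function, double, that is not taken from the text.

```python
# Sketch only: numeral-types are modelled as positive Python integers.

def summation(f, a: int) -> int:
    """Definition 1.10: the sum up to | is f(|);
    the sum up to S(a) is f(S(a)) plus the sum up to a."""
    if a == 1:
        return f(1)
    return f(a) + summation(f, a - 1)

def double(i: int) -> int:
    # An illustrative choice of f: f(i) = i + i
    return i + i

print(summation(double, 4))   # 2 + 4 + 6 + 8 = 20
```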


Exercise 1.24

Σ_{i=|}^{a} (S(i) − i) = a.


Exercise 1.25

Σ_{i=|}^{||} id(i, |||) = ||.

We can generalize the result of the preceding exercise. Consider any sum of the
following form:

Σ_{i=|}^{a} id(f(i), b) = id(f(|), b) + id(f(||), b) + ··· + id(f(a), b).


If each of the equations

f(|) = b, f(||) = b, ..., f(a) = b


were false, then each of the terms


id(f(|), b), id(f(||), b), ..., id(f(a), b)


would equal | and, hence, their sum would equal a. Their sum would differ from a if
and only if some of the equations


f(|) = b, f(||) = b, ..., f(a) = b


were true. This leads us to the following definition.


Definition 1.11

∃x ≤ a f(x) = b ⟺ Σ_{i=|}^{a} id(f(i), b) ≠ a.


Definition 1.11 introduces the BOUNDED EXISTENTIAL QUANTIFIER: ∃x ≤ __ .


It is to be read as, “There is an x no greater than __ such that ....” For example, the
sentence

∃x ≤ ||||| S(x) = |||

says there is a numeral-type no greater than ||||| whose successor is |||. Here |||||
forms an upper bound for the values of the variable ‘x’. If you complain at this point
that I have not yet defined “no greater than,” I will have to plead guilty. I have defined
the expression ‘∃x ≤ |||||’ without defining the expression ‘≤’. That does not make
the definition any less legitimate. You could apply the definition without having any
independent idea of what ‘≤’ might mean. Informally, though, it might make the
definition easier to digest if you think of ≤ as the natural ordering of numeral-types
in terms of the number of tally marks in their tokens:


| ≤ || ≤ ||| ≤ |||| ≤ ||||| ≤ ...
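
The bounded existential quantifier of Definition 1.11 can be evaluated by literally adding up characteristic values. The Python sketch below models numeral-types as positive integers, which is an assumption of the sketch; exists_le, monus, and ident are hypothetical helper names.

```python
# Sketch only: numeral-types are modelled as positive Python integers.

def monus(a: int, b: int) -> int:
    return max(a - b, 1)          # repeated pred never drops below 1

def ident(a: int, b: int) -> int:
    # Definition 1.7
    return monus(4, monus(a + 1, b) + monus(b + 1, a))

def exists_le(bound: int, f, b: int) -> bool:
    """Definition 1.11: there is an x no greater than bound with f(x) = b
    iff the sum of id(f(i), b) for i from 1 to bound differs from bound."""
    total = sum(ident(f(i), b) for i in range(1, bound + 1))
    return total != bound

def successor(x: int) -> int:
    return x + 1

print(exists_le(5, successor, 3))   # True: the successor of || is |||
print(exists_le(5, successor, 1))   # False: | is nobody's successor
```

The first call corresponds to the example sentence with bound ||||| and target |||.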


Exercise 1.26 
Ax < ||| x + |] =I. 


To indicate how useful our new quantifier is, I will use it to define divisibility 
and primality. First, though, I need to define multiplication and the CONJUNCTION 
operator A (to be read as “and”’). 


Definition 1.12

Definition 1.13 a = b ∧ c = d ⟺ id(a, b) + id(c, d) = ||||.


Exercise 1.27 Explain Definition 1.13 in the way we earlier explained Definition 
1.9. (You might start by supplying the truth table for conjunction.) 


To express the idea that d divides a without remainder, we start with the scheme

∃x ≤ a f(x) = b

from Definition 1.11 and, letting

f(x) = x · d
b = a

we obtain

∃x ≤ a x · d = a.


d divides a without remainder if and only if a is the product of d and a term x no
greater than a. Since we need not look beyond a when searching for the quotient x,
a bounded existential quantifier meets our needs perfectly well.


Definition 1.14 d | a ⟺ ∃x ≤ a x · d = a.

Definition 1.15 prime(a) ⟺ a ≠ | ∧ ¬∃x ≤ pred(a) (x ≠ | ∧ x | a).


When testing a for primality, our search for a’s divisors does not even carry us as 
far as a. So we can, once again, make do with a bounded existential quantifier. 
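
Definitions 1.14 and 1.15 describe searches that stop at a fixed bound, so they can be run as written. The Python sketch below models numeral-types as positive integers and lets Python's `*`, `and`, `not`, and `any` stand in for multiplication, the connectives, and the bounded quantifier; those stand-ins, and the function names, are assumptions of the sketch rather than the text's official definitions.

```python
# Sketch only: numeral-types are modelled as positive Python integers.

def pred(a: int) -> int:
    return max(a - 1, 1)          # pred(|) = |

def divides(d: int, a: int) -> bool:
    """Definition 1.14: d | a iff some x no greater than a has x · d = a."""
    return any(x * d == a for x in range(1, a + 1))

def prime(a: int) -> bool:
    """Definition 1.15: a is not |, and no x up to pred(a) other than | divides a."""
    has_proper_divisor = any(
        x != 1 and divides(x, a) for x in range(1, pred(a) + 1)
    )
    return a != 1 and not has_proper_divisor

print([a for a in range(1, 30) if prime(a)])
```

Both searches are bounded by a or pred(a), so each call terminates after finitely many checks.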


Exercise 1.28 Define the BOUNDED UNIVERSAL QUANTIFIER ∀x ≤ __ (to be read
as: “for all x no greater than __”).


1.8 Π⁰₁ Sentences


The PRIMITIVE RECURSIVE functions are those obtainable from the numeral | and the 
function S by composition and recursive definition.¹ Definitions 1.4, 1.5, 1.6, 1.10,
and 1.12 are all recursive. Definition 1.7 introduces the function id by composing 
S and two recursively defined functions (addition and subtraction). Definitions 1.8, 
1.9, 1.11, 1.13, 1.14, and 1.15 just introduce abbreviations for expressions already 
defined. So all the functions defined in the previous four sections are primitive recur- 
sive. Furthermore, all the defined predicates, relations, and logical operators can be 


1 I am not going to supply rigorous definitions of composition and recursion. A good source for
a more thorough treatment is the WIKIPEDIA article on primitive recursive functions
(http://en.wikipedia.org/wiki/Primitive_recursive_function). For an especially meticulous (though perhaps
not so readable) presentation, see Curry [3] (http://www.jstor.org/stable/2371522).

replaced by primitive recursive characteristic functions. Indeed, every proposition
we can formulate with our defined expressions (and with any expressions similarly
defined) will be equivalent to a sentence of the form


f(a) = |


where f is primitive recursive.

Is that obvious? Hardly. You may even have doubts. One reason for skepti-
cism is that we can easily formulate propositions that have no occurrences of ‘a’ or 
any other variable. How can a variable-free formula be equivalent to a generalization 
of the form f(a) = |? There are, in fact, all sorts of sneaky ways to arrange for such 
an equivalence. For example, the formula 


S(|) ≠ |


is equivalent to the generalization

id(S(| − a), | − a) = |


since Exercise 1.10 guarantees that | − a = |.


Exercise 1.29 Write down a sentence of the form f(a) = | (f primitive recursive) 
equivalent to ‘| + |= ||’. 


Another worry is that some of our sentences feature more than one variable. How 
could a sentence like ‘a + b = b + a’ be equivalent to a sentence with occurrences of
only one variable? The answer is that there are tricks that allow us to code up finite 
sequences of terms using just one term. We will now consider one such trick. First, 
we define exponentiation. 


Definition 1.16

a^| = a

a^S(b) = a^b · a


It would now be helpful to have a characteristic function for divisibility. That is,
we would like a function div such that

div(d, a) = || if d | a
            | otherwise.


To confirm that we can define such a function, note that the following are equivalent. 


∃x ≤ a x · d = a

Σ_{i=|}^{a} id(i · d, a) ≠ a

id(Σ_{i=|}^{a} id(i · d, a), a) = |

id(id(Σ_{i=|}^{a} id(i · d, a), a), |) = ||.


So, since d | a if and only if ∃x ≤ a x · d = a, we can let

div(d, a) = id(id(Σ_{i=|}^{a} id(i · d, a), a), |).
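
The nested id expression for div can be evaluated step by step. This Python sketch models numeral-types as positive integers and uses Python's `*` for multiplication; both are assumptions of the sketch, as are the names monus and ident.

```python
# Sketch only: numeral-types are modelled as positive Python integers.

def monus(a: int, b: int) -> int:
    return max(a - b, 1)          # repeated pred never drops below 1

def ident(a: int, b: int) -> int:
    # Definition 1.7
    return monus(4, monus(a + 1, b) + monus(b + 1, a))

def div(d: int, a: int) -> int:
    """div(d, a) = id(id(sum of id(i · d, a) for i from 1 to a, a), |)."""
    total = sum(ident(i * d, a) for i in range(1, a + 1))
    # total is a when no multiple of d hits a, and a + 1 when one does
    return ident(ident(total, a), 1)

print([div(d, 12) for d in range(1, 13)])
```

The value 2 appears exactly at the divisors of 12 (1, 2, 3, 4, 6, and 12), which is the behavior required of a characteristic function for divisibility.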


Now consider

(Σ_{i=|}^{||^||} div(||^i, ||^||)) − ||^||

or, to revert to more familiar terminology,

(Σ_{i=1}^{2^2} div(2^i, 2^2)) − 2^2.

2^1 and 2^2 divide 2^2, while 2^3 and 2^4 (= 2^(2^2)) do not. So the first two terms of our
summation are 2, while the last two are 1. That is,

(Σ_{i=1}^{2^2} div(2^i, 2^2)) − 2^2 = (2 + 2 + 1 + 1) − 2^2 = 2.

Similarly,

(Σ_{i=1}^{2^3} div(2^i, 2^3)) − 2^3 = (2 + 2 + 2 + 1 + 1 + 1 + 1 + 1) − 2^3 = 3.

Note, too, that

(Σ_{i=1}^{3^2} div(3^i, 3^2)) − 3^2 = (2 + 2 + 1 + 1 + 1 + 1 + 1 + 1 + 1) − 3^2 = 2.

If a does not divide c, then

(Σ_{i=1}^{a^b · c} div(a^i, a^b · c)) − a^b · c = b.

This motivates the following definition.

Definition 1.17

exp(a, b) = (Σ_{i=1}^{b} div(a^i, b)) − b.

If a divides b, then exp(a, b) is the number of times it does so. For example,

exp(2, 432) = exp(2, 2^4 · 3^3) = 4

while

exp(3, 432) = exp(3, 2^4 · 3^3) = 3.


Some may find it helpful to note that if a is even, then exp(2, a) is the exponent 
of 2 in the prime factorization of a. More generally, we can use exp to extract the 
exponent from any term in a prime factorization. 
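
The examples for exp can be recomputed from Definition 1.17. The Python sketch below models numeral-types as positive integers and, as a shortcut, lets Python's `%` and `**` operators stand in for the id-based definition of div and for Definition 1.16; these substitutions are assumptions of the sketch.

```python
# Sketch only: numeral-types are modelled as positive Python integers.

def monus(a: int, b: int) -> int:
    return max(a - b, 1)          # repeated pred never drops below 1

def div(d: int, a: int) -> int:
    # Shortcut: the % operator stands in for the id-based characteristic function.
    return 2 if a % d == 0 else 1

def exp(a: int, b: int) -> int:
    """Definition 1.17: (sum of div(a^i, b) for i from 1 to b) − b."""
    return monus(sum(div(a ** i, b) for i in range(1, b + 1)), b)

print(exp(2, 432), exp(3, 432))   # 4 3
print(exp(5, 432))                # 1
```

The last line shows a side effect of working without a numeral for "no tally marks": in this model, when a does not divide b at all, the subtraction bottoms out at | rather than at zero.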

How does this help us use a single term to code up a sequence of terms? Suppose 
we want to express the commutativity of addition using just one variable. We would 
be looking for a sentence that implies each instance of the generalization 


a + b = b + a.


Try this: 
exp(2, a) + exp(3, a) = exp(3, a) + exp(2, a). 


Since this is itself an instance of the generalization we are trying to capture, it does
not say more than it should. But does it say enough? For example, does it imply that
5 + 4 = 4 + 5? To see that it does, consider the following instance:


exp(2, 2592) + exp(3, 2592) = exp(3, 2592) + exp(2, 2592).

Note that 2592 = 2^5 · 3^4. So exp(2, 2592) = 5 and exp(3, 2592) = 4. This yields
the desired conclusion: 5 + 4 = 4 + 5. There is nothing special about 4 and 5: any
two numbers would have worked. So 


exp(2, a) + exp(3, a) = exp(3, a) + exp(2, a) 


does express the commutativity of addition using just one variable. Putting everything 
in the canonical form f(a) = | takes just a little more work. One solution is 

id(id(exp(||, a) + exp(|||, a), exp(|||, a) + exp(||, a)), |) = |


which is the double negation of ‘exp(||, a) + exp(|||, a) = exp(|||, a) + exp(||, a)’.
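
The coding trick used here, packing two numbers into the exponents of 2 and 3 and recovering them with exp, can be tried out directly. In this Python sketch, numeral-types are modelled as positive integers, `%` and `**` stand in for div and exponentiation, and encode and decode are hypothetical helper names; none of these choices comes from the text.

```python
# Sketch only: numeral-types are modelled as positive Python integers.

def monus(a: int, b: int) -> int:
    return max(a - b, 1)          # repeated pred never drops below 1

def div(d: int, a: int) -> int:
    return 2 if a % d == 0 else 1  # shortcut for the characteristic function

def exp(a: int, b: int) -> int:
    # Definition 1.17
    return monus(sum(div(a ** i, b) for i in range(1, b + 1)), b)

def encode(m: int, n: int) -> int:
    """Pack the pair (m, n) into the single number 2^m · 3^n."""
    return 2 ** m * 3 ** n

def decode(a: int) -> tuple:
    """Recover the pair from the exponents of 2 and 3."""
    return exp(2, a), exp(3, a)

print(encode(5, 4))    # 2592
print(decode(2592))    # (5, 4)
```

Since every pair of positive integers is the decoding of some single number, a claim about all values of one variable can cover all pairs.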


Exercise 1.30 Write down a sentence of the form f(a) = | (f primitive recursive)
equivalent to ‘a − (b + c) = (a − b) − c’. We are helped here by the fundamental
theorem of arithmetic: every natural number greater than 1 has a unique prime
factorization.


A sentence is said to be Π⁰₁ if it is equivalent to a sentence of the form f(a) = |
where f is primitive recursive. You might think that Π⁰₁ sentences are all quite trivial,
but that is wrong. The following two sentences are Π⁰₁. (In the second sentence, I
write ‘|| ∤ a’ for “|| does not divide a.”)


a^(d+||) + b^(d+||) ≠ c^(d+||)

|| ∤ a ∨ a = || ∨ ∃x ≤ a ∃y ≤ a (prime(x) ∧ prime(y) ∧ x + y = a).


The first is Fermat’s Last Theorem; the second is Goldbach’s Conjecture (that every 
even integer greater than two is the sum of two primes). A measure of their depth is 
the failure of mathematicians as profound as Euler and Gauss to either prove or refute 
them. Fermat’s Last Theorem was proved in 1994. Goldbach’s Conjecture remains 
unresolved. 


1.9 Some Philosophy 


David Hilbert was convinced that many, though not all, mathematical formulas lack 
content (Inhalt in German): while many formulas might seem to make definite claims 
about some definite subject matter, they do not really say anything about anything. 
Although there is room for debate about where exactly Hilbert would draw the line 
between formulas with and without content, it is reasonably clear that he thought 
Π⁰₁ sentences, interpreted as claims about numerals, have content (or, at least, this is
what he ought to have thought given various other things he said). 

Hilbert offers some obscure reasons for believing that numerals, particularly our 
numerals 


are well-suited to be the subject matter of meaningful mathematics. For example, 
while he seems to concede that we produce tokens of a numeral in various ways, 
resulting in tokens that differ slightly from one another, he insists that we can recog- 
nize in all of them, wherever and whenever they might occur, a single form or shape 
(Gestalt). 

Students of Plato might, at this point, recall Symposium 211a-d where Diotima 
discusses beauty itself, “always one in form,” not beautiful at one time, ugly at 


2 See Hilbert [5], p. 163; English translation in Mancosu [6], p. 202. 
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another, or beautiful at one place, ugly at another. Indeed, beauty itself is the one 
form shared by all beautiful things, just as a numeral-type is the one form or shape 
shared by all tokens of that type. Plato was sure that the one form could be better 
known than any of the many instances: well-founded beliefs about beauty itself will 
be clearer and more reliable than well-founded beliefs about any of the beautiful 
things in the unstable world of experience. This is not necessarily an otherworldly 
or unscientific attitude. Note, for example, that physicists are not interested in any 
electrons in particular: they investigate the laws governing all electrons. These very 
laws set limits to how much information we can acquire about individual electrons. 
We might say, with Plato, that the laws are “more knowable” than the distinctive 
features of the individuals they govern. 

Someone with Platonic inclinations might embrace numeral-types, the shapes
of numeral-tokens, as the objects of meaningful mathematics. Hilbert’s remarks,
however, point more in the direction of numeral-tokens. He says, for example, that
the objects of meaningful mathematics are present to us in immediate experience
(unmittelbares Erlebnis).³ While this is arguably an apt description of our contact
with numeral-tokens, it seems not to apply to numeral-types. Our only “experience” 
of the types is “mediated” by our experience of the tokens: we acquaint ourselves 
with the types by looking at the tokens. Furthermore, Hilbert says quite clearly that 
his numerals are not the shapes shared by the instances, they are the instances that 
share the shapes.⁴ Should we conclude, then, that Hilbert considered our
numeral-tokens to be the objects of his meaningful mathematics? No, that would be far too
hasty. We have barely begun to make such a case. To mention just one important 
piece of unfinished business, we have not confirmed that Hilbert would accept our 
neat dichotomy: numeral-type/numeral-token. 

It would be best, at this point, to let the professionals work on fine-grained inter- 
pretations of Hilbert. Without worrying too much about what Hilbert thought, let 
us consider what we ought to think. Let us get some practice using the little bit of 
philosophical machinery we have discussed in this chapter. For most of the chapter, 
we have described how numeral-types do or could behave. It has turned out that 
they do or could behave just like the positive integers (1, 2, 3, ...). Now we raise a 
question: would it make sense to insist that our version of arithmetic is also a theory 
of numeral-tokens: a theory about how numeral-tokens do or could behave? Would it 
make sense to say that the arithmetical properties we have attributed to numeral-types 
also apply to numeral-tokens—that numeral-tokens do or could behave just like the 
positive integers? Let us exercise our philosophical muscles by trying to show that 
this would make sense. I claim only that this enterprise will be educational. Whether 
we can really fabricate a viable philosophical position along these lines—well, that 
is something you will have to decide for yourself. 

We encounter a problem right away: our theory says that ‘|’ has only one imme- 
diate successor, yet there are many tokens of ‘| |’ all of which deserve to be called 
immediate successors of the many tokens of ‘|’. As Hilbert himself would be quick to


3 See Hilbert [5], p. 162; Mancosu [6], p. 202. 
4 See Hilbert [5], p. 163, footnote 1; Mancosu [6], p. 214. 
remark, mathematicians have a ready-made response to this problem. It is common in
mathematics to treat objects as identical when they are really only equivalent in some
well-defined sense. There are tricks that, in favorable circumstances, allow
mathematicians to do this without any logical difficulties.⁵ If sameness-of-shape is the sort
of equivalence relation required for this mathematical sleight of hand, then these
tricks would allow us to use the language of shapes when we are really only talking
about shaped things.⁶ Our talk about “one shape” would then be interpretable as an
economical way of talking about the many things so shaped. This would allow us to
say that the one and only shape ‘||’ is the unique immediate successor of the shape
‘|’ even while insisting that the official objects of our theory are numeral-tokens,
including the many tokens of ‘||’. Or, to put the matter somewhat differently, we
could insist that each numeral-token has at most one immediate successor, but when
pressed, we would have to acknowledge that we are using the phrase “at most one”
in an unusual way.

Having too few successors may be a more pressing problem than having too 
many. If every numeral-token has an immediate successor, there are infinitely many 
numeral-tokens. But if numeral-tokens are physical, macroscopic objects accessible 
to us through our senses (as I, for one, have been assuming), it is not so clear why we 
should believe they are infinite in number. This is reminiscent of our earlier discussion 
of numeral-types in Sect. 1.3. Even in the absence of compelling evidence that there 
are infinitely many numeral-types, we decided it might be illuminating to explore 
what things would be like if there were infinitely many. What is required for such 
an exploration to be successful? When we set out to explore possible X’s, we mean 
to be discussing the behavior of X’s in situations that can be coherently described. 
If our description would be incoherent no matter what we meant to be discussing, 
we will be trying to pass off as possible something that is logically impossible. If 
our description is incoherent because it is incompatible with our understanding of 
what it is to be an X, then we will be trying to pass off as possible something that 
is conceptually impossible. It appears that the world of numeral-types, as we have 
described it so far, is possible in both senses. At least, we have encountered no 
evidence to the contrary. 

Can we say the same about a world inhabited by infinitely many numeral-tokens? 
Is an infinitude of numeral-tokens possible both logically and conceptually? As for
logical possibility, it is important to remember what our project is right now. We
are trying our best to show that the Π⁰₁ sentences we would normally treat as true
assertions about the positive integers can be reinterpreted as coherent claims about 
numeral-tokens. In making this case, we should be able to rely entirely on postulates 
that attribute to numeral-tokens certain abstract, mathematically salient properties: 
properties that numeral-tokens could share with positive integers. If we really have 


5 For an especially clear discussion see Burgess [2], pp. 157-161.

6 But is sameness-of-shape the right sort of relation? Equivalence relations of the sort required are
transitive: if a relates to b and b relates to c, then a relates to c. However, the relation
appears-to-be-the-same-shape is, notoriously, not transitive because imperceptible differences in shape can
add up to perceptible ones. Does this consideration not apply to our numeral-tokens?

a fighting chance of showing that numeral-tokens could behave like the positive
integers, then our project will probably not require us to emphasize the ways
numeral-tokens differ from positive integers. So we should be able to reverse the process and
reinterpret our claims about numeral-tokens as claims about positive integers.⁷ That
is, if our theory leads us to say that numeral-tokens have some abstract property, 
then ordinary arithmetic should lead us to say that the positive integers have that 
same property. But, if our theory of numeral-tokens were interpretable in ordinary 
arithmetic, then any absurd consequence of the former theory would be translated 
into an absurd result of arithmetic. If our theory led us to say that numeral-tokens both 
have and lack a certain abstract property (to take one example of a palpable absurdity), 
then ordinary arithmetic would lead us to say that the positive integers both have and 
lack that property. That is, our theory of numeral-tokens would be inconsistent only if 
arithmetic itself were inconsistent.⁸ So we should worry about the logical possibility
of infinitely many numeral-tokens only if we are already worried about the logical 
possibility of infinitely many positive integers. 

Our second question is whether an infinitude of numeral-tokens is conceptually 
possible. This is a question about our concept of numeral-token and how far we 
can venture while still discussing things that answer to that concept. Suppose it is 
indeed part of our concept that numeral-tokens are macroscopic physical objects. 
Suppose, too, it is part of our concept of physical objects that they are subject to the 
physical necessities prevailing here in the actual world. That would make it concep- 
tually impossible for physical objects to do what is physically impossible. So, before 
we could be confident that an infinitude of numeral-tokens is conceptually possible, 
we would need to be reassured that it is physically possible. That sounds like a job 
for the physicists. While we wait to hear from them, we should note that a favor- 
able verdict from the physicists might still leave us with our philosophical question 
about conceptual possibility: when we discuss imaginary scenarios in which there 
are infinitely many objects that we call “numeral-tokens,” are we really discussing 
numeral-tokens? Even if such scenarios are physically possible, they might conflict 
in some other way with our concept of numeral-tokens or with the concept we decide 
we ought to have after we think about it a while. Even after getting help from the 
physicists, we may need to consider very carefully what we mean by a numeral-token 
before we can make progress.⁹ Though we will not pursue the matter further here, I


7 A case of the sort I have in mind is discussed in Chap. 3: when we craft a set theory just strong 
enough to supply an interpretation of arithmetic, it turns out that arithmetic supplies an interpretation 
of our set theory. Positive integers are exceptionally powerful devices for coding up all sorts of 
information—information that may or may not have obvious connections to arithmetic. (We got 
a taste of this in the previous section when we used individual numbers to code ordered pairs of 
numbers. There will be more such examples in Chaps. 2 and 3.) It would be surprising if the Π⁰₁ part
of arithmetic were too weak to code up the mathematically salient claims about numeral-tokens that
we need for our reinterpretation of Π⁰₁ sentences. Granted, interpretability does not always work
both ways. There are theories that interpret arithmetic but are not interpretable in arithmetic. Our 
project, however, does not seem to require us to formulate such a theory. 


8 Section 5.4 of Chap. 5 may help you see how this works. 


° To take an extreme example of the sort of thing I have in mind, suppose we try to imagine what it 
would be like for an elephant to be a pencil. Will we then be thinking about elephants and pencils 
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do hope you will continue to think about it. One of the purposes of this philosophical 
exercise was to inspire you to help the rest of us think about these issues more clearly. 

Some of the above considerations seem not to apply to numeral-types because, 
at least as I understand them, they are not physical objects subject to physical laws. 
(If you think they are physical objects, you should be prepared to face the question 
of where they are. Or are they, perhaps, special physical objects that have no spatial 
location?) Even if you accept that numeral-types are themselves non-physical, you 
might nonetheless think it conceptually impossible for a numeral-type to exist without 
having physical tokens inhabiting our universe. (Your slogan might be: no token-less 
types.) You would then have to concede that the above observations about physical 
possibility do apply to numeral-types, since any conceptually possible scenario in 
which there are infinitely many types would be one in which there are infinitely many 
tokens. But is there really a good reason to believe that numeral-types are subject to 
such a constraint? 

Hilbert’s assistant and collaborator PAUL BERNAYS (1888-1977) did not consider 
either numeral-tokens or numeral-types to be the primary objects of meaningful 
mathematics. Mathematicians, he says, study structures. “Figures” such as 


are of mathematical interest because they exemplify an important structure: the struc- 
ture formed by the first four ordinal numbers. The figures 


0 0’ 0” 0” 
exemplify the same structure. The shared structure 
first second third fourth 


is the real object of mathematical inquiry. An indication of this is the mathematician’s 
disinterest in the “inessential” differences between the two sequences of figures. The 
figures ‘|’ and ‘0’ are certainly different, but here they play the same mathematical 
role. They represent the same position in the sequence of ordinals: the position first. 
It is characteristic of mathematical thought to ignore the differences and focus on 
the figures’ role as representatives of a particular position in an abstract structure.!° 


or about things we just call “elephants” and “pencils”? For a real-life example, consider what 
Geraldine Ferraro said when Barack Obama became the front-runner in the race for the Democratic 
presidential nomination. “If Obama was a white man, he would not be in this position. And if he 
was a woman of any color, he would not be in this position. He happens to be very lucky to be who 
he is. And the country is caught up in the concept.” Can we really think clearly about a situation in 
which Barack Obama (not someone /ike Obama, but Obama himself) is, say, a Polynesian woman? 
As our imaginary excursions become more and more fanciful, could we not reach a point where we 
are not really talking about Obama? 


!0 See Bernays [1], pp. 338-341; English translation in Mancosu [6], pp. 243-245. 
We have been considering whether we can supply $\Pi^0_1$ sentences with genuine
content by interpreting them as claims about concrete objects of sensory experience:
that is, numeral-tokens. Bernays, as we just saw, considered this approach strangely
out of step with the characteristically mathematical way of thinking. KURT GÖDEL
(1906-1978) offers another reservation. When we say that $\Pi^0_1$ sentences, interpreted
in a certain way, have content, does that claim itself have content? In our role as
cheerleaders for numeral-tokens, we might say “yes” because we hope to interpret
this assertion about $\Pi^0_1$ sentences as a claim about concrete objects of sensory
experience: namely, $\Pi^0_1$ sentence-tokens. However, if asked to explain what makes a
sentence $\Pi^0_1$, we would have to invoke “the general principle of primitive recursive
definition” which “contains the abstract concept of function.”11 Since, according
to Gödel, functions are not concrete objects of sensory experience, either (1) statements
about abstract objects can have content or (2) we have not really managed
to say which sentences have content. In the first case, we have no reason to insist
on a concrete interpretation of $\Pi^0_1$ sentences (though we might still consider this
an option worth pursuing). In the second case, when we appear to be describing a
distinctive philosophical or methodological position, we are actually saying nothing.


1.10 Solutions of Odd-Numbered Exercises 


1.1 Suppose σ(xy). Pick x′, y′ such that τ(x′) = τ(x), τ(y′) = τ(y), U(x′y′).
Now suppose σ(zy). Pick z′, y″ such that τ(z′) = τ(z), τ(y″) = τ(y), U(z′y″).
Then τ(y′) = τ(y″) and, hence, by Proposition 1.2, τ(x′) = τ(z′). So τ(x) = τ(z).
Now go the other direction. Suppose τ(x) = τ(z), still assuming that σ(xy). Then
τ(x′) = τ(z) and, hence, a numeral-token of the same type as y (namely y′) is the
result of adding one tally mark to a numeral-token of the same type as z (namely x′).
That is, σ(zy).


1.3 Barack Obama has two children: Malia and Sasha. According to our definition: 


child(the father of Malia) = Malia 
child(the father of Sasha) = Sasha. 


Furthermore: 
the father of Malia = the father of Sasha. 
So: 
Malia = child(the father of Malia) = child(the father of Sasha) = Sasha. 


The problem is that children with the same father can be distinct, whereas numeral- 
types with the same successor cannot. 


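11 See Gödel [4], p. 272, footnote b.

1.5

$$
\begin{aligned}
| - ||| &= | - S(S(|)) \\
&= \mathrm{pred}(| - S(|)) \\
&= \mathrm{pred}(\mathrm{pred}(| - |)) \\
&= \mathrm{pred}(\mathrm{pred}(\mathrm{pred}(|))) \\
&= \mathrm{pred}(\mathrm{pred}(|)) \\
&= \mathrm{pred}(|) \\
&= |
\end{aligned}
$$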


1.7 Definition 1.4 and Theorem 1.1 confirm that the equation is true when a = |.
Our inductive hypothesis is that

$$\sim + b = b + \sim.$$

By Definition 1.4, Theorem 1.2, and our inductive hypothesis,

$$S(\sim) + b = S(\sim + b) = S(b + \sim) = b + S(\sim).$$

So the desired property passes from ∼ to S(∼).
1.9 By Definition 1.6,

$$a - (b + |) = a - S(b) = \mathrm{pred}(a - b) = (a - b) - |.$$

So the equation is true when c = |. Our inductive hypothesis is that

$$a - (b + \sim) = (a - b) - \sim.$$

By Definitions 1.4 and 1.6,

$$a - (b + S(\sim)) = a - S(b + \sim) = \mathrm{pred}(a - (b + \sim)).$$

By Definition 1.6 and our inductive hypothesis,

$$\mathrm{pred}(a - (b + \sim)) = \mathrm{pred}((a - b) - \sim) = (a - b) - S(\sim).$$

So

$$a - (b + S(\sim)) = (a - b) - S(\sim).$$

That is, the desired property passes from ∼ to S(∼).


1.11 By Definition 1.4 and Exercises 1.7 and 1.8,

$$S(a) - a = (a + |) - a = (| + a) - a = |.$$


1.13 It follows from Definition 1.7 and Exercise 1.7 that id(a, b) = id(b, a). By
Exercise 1.12,

$$|||| - ((S(a) - b) + (S(b) - a)) = |||| - ((S(S(a)) - S(b)) + (S(S(b)) - S(a))).$$


1.15 Our proof is by induction. Note, first, that Definitions 1.5 and 1.6 and Exercises
1.7, 1.9, and 1.10 give us:

$$||| - (a + |) = ||| - (| + a) = (||| - |) - a = || - a = |.$$

By Definitions 1.4, 1.5, and 1.6 and an inductive hypothesis,

$$||| - (a + S(\sim)) = ||| - S(a + \sim) = \mathrm{pred}(||| - (a + \sim)) = \mathrm{pred}(|) = |.$$


1.17 By Exercises 1.9, 1.10, and 1.11,

$$S(a) - (a + b) = (S(a) - a) - b = | - b = |.$$

By Definition 1.4 and Exercises 1.7 and 1.8,

$$S(a + b) - a = (a + S(b)) - a = (S(b) + a) - a = S(b).$$

So Definitions 1.4, 1.6, and 1.7 and Exercises 1.9, 1.10, and 1.15 let us reason as
follows.

$$
\begin{aligned}
\mathrm{id}(a, a + b) &= |||| - (| + S(b)) \\
&= (|||| - |) - S(b) \\
&= ||| - S(b) \\
&= ||| - (b + |) \\
&= |.
\end{aligned}
$$
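
The tally computations in these solutions can be spot-checked mechanically. The following is a minimal sketch and not part of the original text. It assumes that Definitions 1.4-1.7 have the recursive clauses the solutions rely on (a + | = S(a), a + S(b) = S(a + b), pred(|) = |, a - | = pred(a), a - S(b) = pred(a - b)); the function names `add`, `sub`, and `ident` are hypothetical.

```python
# Tally numerals are modelled as non-empty strings of '|'.
# ASSUMPTION: the clauses below are inferred from how the solutions
# use Definitions 1.4-1.7; they are not quoted from the book.

def S(a: str) -> str:
    """Successor: add one tally mark."""
    return a + "|"

def pred(a: str) -> str:
    """Predecessor, with pred(|) = |."""
    return a[:-1] if len(a) > 1 else "|"

def add(a: str, b: str) -> str:
    """a + | = S(a);  a + S(b) = S(a + b)."""
    return S(a) if b == "|" else S(add(a, pred(b)))

def sub(a: str, b: str) -> str:
    """a - | = pred(a);  a - S(b) = pred(a - b)."""
    return pred(a) if b == "|" else pred(sub(a, pred(b)))

def ident(a: str, b: str) -> str:
    """id(a, b) = |||| - ((S(a) - b) + (S(b) - a))."""
    return sub("||||", add(sub(S(a), b), sub(S(b), a)))
```

Under these assumptions, `sub("|", "|||")` returns `"|"`, in line with Solution 1.5, and `ident(a, add(a, b))` returns `"|"` for every pair of small tallies, as Solution 1.17 predicts. A finite check of this kind is evidence, not a substitute for the inductive proofs.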
1.19 As an inductive hypothesis, we assume that

$$\mathrm{id}(a, \sim) + \mathrm{id}(\mathrm{id}(a, \sim), |) = |||.$$

We want to show that

$$\mathrm{id}(a, S(\sim)) + \mathrm{id}(\mathrm{id}(a, S(\sim)), |) = |||.$$

To that end, we begin a second induction (this time on a). Exercises 1.13 and 1.18
guarantee that

$$\mathrm{id}(|, S(\sim)) + \mathrm{id}(\mathrm{id}(|, S(\sim)), |) = |||.$$

Exercise 1.13 and our earlier inductive hypothesis let us reason as follows.

$$\mathrm{id}(S(a), S(\sim)) + \mathrm{id}(\mathrm{id}(S(a), S(\sim)), |) = \mathrm{id}(a, \sim) + \mathrm{id}(\mathrm{id}(a, \sim), |) = |||.$$

This completes our induction on a (an odd one without its own inductive hypothesis).
So we have confirmed that

$$\mathrm{id}(a, S(\sim)) + \mathrm{id}(\mathrm{id}(a, S(\sim)), |) = |||.$$

This completes our first induction.


1.21 We can use Definition 1.4 to confirm that |||| = || + ||. So, by Exercises 1.13
and 1.17, id(||||, ||) = | and, hence, by Exercise 1.16, id(id(||||, ||), |) = ||.
Definition 1.4 implies that || = | + |. So, by Exercises 1.13 and 1.17,
id(id(id(||||, ||), |), |) = |. That is, ¬(¬(¬(|||| = ||))).


1.23 An adjective is heterological if and only if it does not apply to itself. For example, 
the adjective ‘monosyllabic’ is heterological because it is not monosyllabic. What 
about the adjective ‘heterological’? If it is heterological, then it applies to itself and, 
hence, is not heterological. If it is not heterological, then it does not apply to itself 
and, hence, is heterological. So if it either is or is not heterological, then it both is 
and is not heterological. That is, the instance of LEM we are considering yields an 
outright contradiction. 


1.25

$$\sum_{i=|}^{||} \mathrm{id}(i, |||) = \mathrm{id}(|, |||) + \mathrm{id}(||, |||) = | + |$$

1.27 Here is the truth table for conjunction.

φ   ψ   φ ∧ ψ
T   T     T
T   F     F
F   T     F
F   F     F

The idea is that a conjunction is true if and only if both its components are true.
If I assert “φ and ψ,” I am asserting that φ and ψ are both true. Now consider the
following table.

α = β   γ = δ   id(α, β)   id(γ, δ)   id(α, β) + id(γ, δ)   α = β ∧ γ = δ
  T       T       ||         ||             ||||                  T
  T       F       ||         |              |||                   F
  F       T       |          ||             |||                   F
  F       F       |          |              ||                    F

If I assert “α = β and γ = δ,” I am asserting that id(α, β) and id(γ, δ) are
both || and, hence, that id(α, β) + id(γ, δ) is ||||. Definition 1.13 guarantees that the
conjunction of two equations is true if and only if both equations are true.

1.29 Here is one example. 


id(id(| +|,|)),|—@) =I. 
Chapter 2
Peano Arithmetic, Incompleteness 


2.1 The language of PA 


We have seen that some profound claims about the natural numbers are $\Pi^0_1$. On the
other hand, the vocabulary of number theory allows us to formulate sentences that
are not $\Pi^0_1$. Hilbert would say that at least some of these sentences lack content.
We are going to consider a theory replete with such sentences. The theory is Peano
Arithmetic (PA), a formalized version of elementary number theory.1 A formal theory
needs a formal language. The language of PA has the following vocabulary.


1. Two CONNECTIVES: ‘¬’ (“not”), ‘→’ (“if ...then”).
2. A QUANTIFIER: ‘∀’ (“for all”).
3. The IDENTITY symbol: ‘=’.
4. Three FUNCTION symbols: ‘S’ (“successor”), ‘+’ (“plus”), ‘·’ (“times”).
5. One PROPER NAME: ‘0’ (“zero”).
6. Infinitely many VARIABLES: ‘w’, ‘x’, ‘y’, ‘z’, ‘w₁’, ‘x₁’, ‘y₁’, ‘z₁’, ...
7. Two PARENTHESES: ‘(’, ‘)’.


In everyday English, certain expressions refer to individual things or can so refer
in appropriate contexts. Examples include proper names (‘Kurt Gödel’), pronouns
(‘him’), and descriptions (‘the second son of Marianne Gödel’). The TERMS of PA are
the expressions that play this role in our formal language. We define them recursively
as follows.


8. ‘0’ is a term.
9. Every variable is a term.
10. If α and β are terms, then so are ⌜Sα⌝, ⌜(α + β)⌝, and ⌜(α · β)⌝.
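
Clauses 8-10 translate directly into a recursive recognizer. The sketch below is an illustration, not anything from the text. It assumes an ASCII spelling in which `*` stands for the times symbol, subscripted variables are written `x1`, `y2`, and so on, and terms contain no spaces. The name `is_term` is hypothetical.

```python
import re

# ASSUMPTION: variables are w, x, y, z, optionally followed by a positive index.
VARIABLE = re.compile(r"[wxyz]([1-9][0-9]*)?")

def is_term(s: str) -> bool:
    """Decide whether s is a PA-term, following clauses 8-10."""
    # Clause 8 ('0' is a term) and clause 9 (every variable is a term).
    if s == "0" or VARIABLE.fullmatch(s):
        return True
    # Clause 10, first case: 'S' followed by a term.
    if s.startswith("S"):
        return is_term(s[1:])
    # Clause 10, other cases: '(' term op term ')', with op at nesting depth 1.
    if s.startswith("(") and s.endswith(")"):
        depth = 0
        for i, ch in enumerate(s):
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            elif ch in "+*" and depth == 1:
                return is_term(s[1:i]) and is_term(s[i + 1:-1])
    return False
```

With this encoding, `is_term("((x+y)+z)")` and `is_term("S0")` hold, while `is_term("0S")` and `is_term("x+y")` do not: the second lacks the parentheses that clause 10 requires.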


1 Although the ‘P’ in ‘PA’ commemorates the mathematician GIUSEPPE PEANO (1858-1932),
another mathematician, RICHARD DEDEKIND (1831-1916), deserves much of the credit. For some
of the history, see Wang [11] (http://www.jstor.org/stable/2964176).


In everyday English, declarative sentences declare that something is the case
(‘Kurt Gödel was smart’) or could, if placed in appropriate contexts, declare that
something is the case (‘He was smart’). The FORMULAS of PA are the expressions
that play this role in our formal language. We define them recursively as follows.


11. If α and β are terms, then ⌜α = β⌝ is a formula.
12. If φ and ψ are formulas, then so are ⌜¬φ⌝ and ⌜(φ → ψ)⌝.
13. If α is a variable and φ is a formula, then ⌜∀α φ⌝ is a formula.


Symbols are said to OCCUR in terms and formulas. For example, the function 
symbol ‘+’ occurs in the term ‘((x + y) + z)’. In fact, it occurs twice: it has two 
OCCURRENCES. The left-hand parenthesis ‘(’ also occurs twice, as does the right-hand 
parenthesis ‘)’. The variables ‘x’, ‘y’, and ‘z’ each occur once. 

The symbols ‘⌜’ and ‘⌝’ are known as CORNER QUOTES.2 If α is a term, then
⌜Sα⌝ is the term consisting of an occurrence of the function symbol ‘S’ followed by
an occurrence of the term α. If α is ‘0’, then ⌜Sα⌝ is ‘S0’. Other corner quotations
are to be understood similarly. Note that ordinary single quotation marks would not
do what we intend here. For example, ‘Sα’ is not a term of PA even if α is. This
is because ‘α’ is not a term of PA. ‘α’ is an expression we use to discuss the terms
of PA. To take a less technical example: ‘David Hilbert’ is an expression we use to
designate a certain mathematician. To say that ‘David Hilbert’ was a mathematician
is to say that David Hilbert’s name was a mathematician. As far as we know, no
mathematician has ever been David Hilbert’s name. (Such an arrangement is unlikely
to be practical.)


Exercise 2.1 Suppose α is a PA-term. Is ⌜αS⌝ a PA-term? Is ⌜Sα⌝ a PA-term?
Does α occur in ⌜Sα⌝? Does α occur in ‘Sα’? Does ‘α’ occur in ⌜Sα⌝? Does ‘α’
occur in ‘Sα’?


2 See Quine [9, pp. 33-37]. I will soon get sloppy and stop using the corner quotes. (Absolute rigor
does get a bit tedious.) I thought, however, that you should know about them.


The SUB-FORMULAS of a formula ψ are ψ itself and any formulas that occur in
ψ. For example, the formula

$$\forall x \forall y\, (x + Sy) = ((x + y) + S0)$$

has three sub-formulas. They are

$$(x + Sy) = ((x + y) + S0),$$

$$\forall y\, (x + Sy) = ((x + y) + S0),$$

and the whole formula itself. Any occurrence of a variable α within a sub-formula
of ψ of the form ⌜∀α φ⌝ is BOUND in ψ. All other occurrences are FREE in ψ. There
are three bound occurrences of ‘x’ in

$$\forall x \forall y\, (x + Sy) = ((x + y) + S0).$$

There are two free occurrences of ‘x’ in

$$\forall y\, (x + Sy) = ((x + y) + S0).$$


A free variable is like a pronoun without an antecedent. The expressions ‘He is short’ 
and ‘x = 0’, outside of any appropriate context, make no definite claims and, so, are 
neither true nor false. (Who is he and what is x?) 
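
The bound/free distinction can be made concrete by walking a formula and tagging each variable occurrence. This is a minimal sketch under an ad hoc encoding that is not the book's: formulas and terms are nested tuples, and the names `term_vars` and `occurrences` are hypothetical. Following the definition above, the occurrence written immediately after a quantifier counts as bound.

```python
# ASSUMED encoding (for illustration only):
#   terms:    "0", a variable name, ("S", t), ("+", t, u), ("*", t, u)
#   formulas: ("eq", t, u), ("not", f), ("imp", f, g), ("all", var, f)

def term_vars(t):
    """List the variable occurrences in a term, left to right."""
    if isinstance(t, str):
        return [] if t == "0" else [t]
    return [v for sub in t[1:] for v in term_vars(sub)]

def occurrences(f, bound=frozenset()):
    """Return a list of (variable, is_bound) pairs, one per occurrence in f."""
    tag = f[0]
    if tag == "eq":
        return [(v, v in bound) for t in f[1:] for v in term_vars(t)]
    if tag == "not":
        return occurrences(f[1], bound)
    if tag == "imp":
        return occurrences(f[1], bound) + occurrences(f[2], bound)
    if tag == "all":
        # The variable next to the quantifier lies inside the quantified
        # sub-formula, so it is itself a bound occurrence.
        return [(f[1], True)] + occurrences(f[2], bound | {f[1]})
    raise ValueError(f"unknown formula tag: {tag!r}")

# The two formulas discussed in the text.
INNER = ("eq", ("+", "x", ("S", "y")), ("+", ("+", "x", "y"), ("S", "0")))
CLOSED = ("all", "x", ("all", "y", INNER))
OPEN_IN_X = ("all", "y", INNER)
```

Counting the pairs for `CLOSED` gives three bound occurrences of `x`, and for `OPEN_IN_X` two free occurrences of `x`, which agrees with the counts stated in the text.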


Exercise 2.2 Identify the bound occurrences of ‘x’, ‘y’, and ‘z’ in the following 
formula. 


$$\forall y \forall z(\neg \forall x\, \neg(z = Sy \to z = x) \to (x + y) = S(x \cdot SS0))$$


In our formal language, we understand SENTENCES to be formulas with no free 
occurrences of variables. This is a departure from everyday usage where ‘He is 
short’ counts as a declarative sentence even in contexts where the pronoun ‘He’ fails 
to refer to anyone and, so, is behaving like a free variable. Each sentence of PA 
makes a definite claim as soon as we supply PA with an interpretation (see Sect. 2.3
below).


2.2 The Axioms of PA 


We will now use our formal language to describe the natural numbers (0, 1, 2, 3, ...). 
Actually, we will only describe how the natural numbers are related to one another: 
we will only describe the structure formed by the natural numbers or (speaking more 
circumspectly) the structure they would form if they were to exist. We do this by 
accepting certain sentences, the AXIOMS, without proof. We show that a sentence 
is a THEOREM by showing that it follows from the axioms. PA has two axioms 
characterizing the successor operator. The first says that 0 is not the immediate 
successor of any natural number (“Given any natural number x, it is not the case that 
0 is identical to the immediate successor of x”). The second says that natural numbers 
with the same immediate successor are the same (“Given any natural numbers x and 
y, if the immediate successor of x is identical to the immediate successor of y, then 
x is identical to y”). 


S1  $\forall x\, \neg 0 = Sx$
S2  $\forall x \forall y(Sx = Sy \to x = y)$


PA has two axioms giving the recursive definition of addition and another two 
offering the recursive definition of multiplication. 


A1  $\forall y\, (y + 0) = y$
A2  $\forall x \forall y\, (y + Sx) = S(y + x)$
M1  $\forall y\, (y \cdot 0) = 0$
M2  $\forall x \forall y\, (y \cdot Sx) = ((y \cdot x) + y)$
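
Read as recursion equations over the natural numbers, the four axioms just listed can be run directly. The sketch below is illustrative only: it assumes the usual reading of ‘S’ as "add one", represents natural numbers as Python integers, and uses the hypothetical names `add` and `mul`.

```python
def add(y: int, x: int) -> int:
    """A1: y + 0 = y.   A2: y + Sx = S(y + x)."""
    if x == 0:
        return y
    return add(y, x - 1) + 1  # x is S(x - 1); apply S to the smaller sum

def mul(y: int, x: int) -> int:
    """M1: y . 0 = 0.   M2: y . Sx = (y . x) + y."""
    if x == 0:
        return 0
    return add(mul(y, x - 1), y)
```

Each call recurses on the second argument only, mirroring the way the axioms define the operations by their behavior at 0 and at successors. The recursion is deep for large inputs, so this is meant for small numbers.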


Finally, PA has infinitely many induction axioms, one for each formula with free
occurrences of ‘x’ and ‘y’. That is, if φ(x, y) is a formula of PA with free occurrences
of ‘x’ and ‘y’, but no free occurrences of any other variable, if φ(0, y) is the result
of replacing each free occurrence of ‘x’ in φ(x, y) with an occurrence of ‘0’, and
if φ(Sx, y) is the result of replacing each free occurrence of ‘x’ in φ(x, y) with an
occurrence of ‘Sx’, then

$$\ulcorner \forall y(\varphi(0, y) \to (\forall x(\varphi(x, y) \to \varphi(Sx, y)) \to \forall x\, \varphi(x, y))) \urcorner$$

is an axiom of PA.
An example may clarify the connection between our induction axioms and the
proofs by induction we did in Chap. 1. We can let φ(x, y) be any PA-formula with
free occurrences only of ‘x’ and ‘y’. If we let φ(x, y) be ‘(y + x) = x’, we obtain
the axiom

$$\forall y((y + 0) = 0 \to (\forall x((y + x) = x \to (y + Sx) = Sx) \to \forall x\, (y + x) = x))$$

This says the formula

$$((y + 0) = 0 \to (\forall x((y + x) = x \to (y + Sx) = Sx) \to \forall x\, (y + x) = x))$$

will be true no matter what y is. So, in particular, we are free to replace each free
occurrence of ‘y’ with an occurrence of ‘0’:

$$((0 + 0) = 0 \to (\forall x((0 + x) = x \to (0 + Sx) = Sx) \to \forall x\, (0 + x) = x))$$

This sentence says that you can show

$$\forall x\, (0 + x) = x$$

by first showing

$$(0 + 0) = 0$$

(to get through the first ‘→’) and then showing

$$\forall x((0 + x) = x \to (0 + Sx) = Sx)$$

(to get through the last ‘→’). The goal here is to show that each natural number x is
identical to 0 + x. The first step is to show that 0 has this property: that 0 is identical
to 0 + 0. The next step is to show that this property is hereditary: that each natural
number passes it on to its successor. You might think of the left-hand part of the
formula

$$((0 + x) = x \to (0 + Sx) = Sx)$$

as an inductive hypothesis, while the right-hand part is the result to be obtained from
that hypothesis.
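
The instance of the induction schema discussed above can be produced mechanically from the formula φ(x, y). The following is a rough sketch, not the book's procedure. It assumes φ is quantifier-free and mentions no variables other than ‘x’ and ‘y’, so that plain string replacement is a safe stand-in for substitution; the function name `induction_axiom` is hypothetical.

```python
def induction_axiom(phi: str) -> str:
    """Build the induction axiom for phi(x, y) as a string.

    ASSUMPTION: phi is quantifier-free and its only variables are 'x' and
    'y', so every 'x' in the string is a free occurrence of the variable.
    """
    base = phi.replace("x", "0")    # phi(0, y)
    step = phi.replace("x", "Sx")   # phi(Sx, y)
    return f"∀y({base} → (∀x({phi} → {step}) → ∀x {phi}))"

EXAMPLE = induction_axiom("(y + x) = x")
```

For the formula ‘(y + x) = x’ this yields the axiom displayed in the text. A faithful implementation would have to distinguish free from bound occurrences of ‘x’, which naive string replacement cannot do.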


Exercise 2.3 Using A1, A2, and the induction axiom we have been discussing, prove
informally that ∀x (0 + x) = x. (I say “informally” because I have not introduced a
formal deductive system of the sort you may have studied in a logic course. You still
need to make good inferences when you do this exercise, but you can just go ahead
and make those inferences without citing any explicit inference rules.)


2.3 Incompleteness 1: Compactness 


As I mentioned above, a theorem of PA is a sentence in the language of PA that 
follows from the axioms of PA. A PROOF in PA is a demonstration that a PA-sentence 
does follow from the PA-axioms. A FORMAL LOGIC for PA will include a definition of 
‘proof’ precise enough to yield a mechanical procedure for telling whether something 
is a proof. The logic of PA, CLASSICAL FIRST-ORDER LOGIC, was first formalized
by GOTTLOB FREGE (1848-1925).3 There are a variety of formalizations equivalent
to Frege’s. Assume we have picked one and suppose Γ is a set of sentences.


Definition 2.1 ψ is DERIVABLE from Γ if and only if there is a formal proof whose
conclusion is ψ and whose premises are members of Γ. If ψ is derivable from Γ, we
write: Γ ⊢ ψ. If ψ is derivable from axioms of PA, we write: PA ⊢ ψ.


We could have characterized the axioms and theorems of PA without offering 
any hints about what those axioms and theorems say. This leaves us free to supply 
various readings or interpretations. In an INTERPRETATION of PA, we (1) specify the 
range of our bound variables, (2) assign an object from that range to ‘0’, and (3) 
assign operations defined on that range to each of ‘S’, ‘+’, and ‘·’. If we are not
feeling too adventurous we might (1) let our bound variables range over the natural
numbers (so that we read ‘∀x’ as “for all natural numbers x”), (2) let ‘0’ be our
name for zero, and (3) assign the operations of immediate succession, addition, and
multiplication to ‘S’, ‘+’, and ‘·’. This, after all, is the intended interpretation of PA.
You will perhaps agree that all the axioms of PA come out true when so interpreted. 
An interpretation that makes all the axioms of PA true is said to be a MODEL of PA. 

3 See Frege [2]; English translation in van Heijenoort [10, pp. 5-82]. If you have had a logic course,
you have probably worked with a close cousin of Frege’s system.


Here is an alternative interpretation. Let our bound variables range over zero and
all the negative integers (so that we read ‘∀x’ as “for all non-positive integers x”). Let
‘0’ be our name for zero. Assign the operation “minus one” to ‘S’ and the operation
of addition to ‘+’. Read ‘x · y’ as “x times y times negative one”. You might take a
minute to confirm that all the axioms of PA come out true when interpreted in this
way. (So this is a model of PA.)
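
A quick way to gain confidence in this alternative interpretation is to test the first six axioms on a finite sample of non-positive integers. The sketch below is only a spot check under that assumption: it does not touch the induction axioms, a finite sample proves nothing about the whole domain, and the names `plus`, `times`, and `check_axioms` are hypothetical.

```python
def S(x: int) -> int:
    return x - 1            # 'S' read as "minus one"

def plus(x: int, y: int) -> int:
    return x + y            # '+' read as ordinary addition

def times(x: int, y: int) -> int:
    return -(x * y)         # 'x . y' read as "x times y times negative one"

DOMAIN = range(0, -30, -1)  # finite sample: 0, -1, ..., -29

def check_axioms(domain) -> dict:
    """Evaluate S1, S2, A1, A2, M1, M2 (and closure) on the sample."""
    pairs = [(x, y) for x in domain for y in domain]
    return {
        "closed": all(S(x) <= 0 and plus(x, y) <= 0 and times(x, y) <= 0
                      for x, y in pairs),
        "S1": all(0 != S(x) for x in domain),
        "S2": all(x == y for x, y in pairs if S(x) == S(y)),
        "A1": all(plus(y, 0) == y for y in domain),
        "A2": all(plus(y, S(x)) == S(plus(y, x)) for x, y in pairs),
        "M1": all(times(y, 0) == 0 for y in domain),
        "M2": all(times(y, S(x)) == plus(times(y, x), y) for x, y in pairs),
    }
```

Every entry of `check_axioms(DOMAIN)` comes out `True`. The M2 case is the instructive one: -(y(x - 1)) equals -(yx) + y, which is exactly the reinterpreted right-hand side.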


Exercise 2.4 Let our bound variables range over all the even natural numbers
(including zero). Complete this interpretation in a way that makes the first six axioms
of PA true.4


I already noted that a proof is a demonstration that a sentence (the conclusion) fol- 
lows from some sentences (the premises). I did not, however, define “follows from”. 
That was probably fine: you probably already understood this sort of “following” 
well enough to make sense of the preceding discussion. Insisting on a definition of 
everything is a sure-fire way to block intellectual progress. Sometimes, however, 
a mathematical definition of a notion that is already well understood is a way of 
drawing that notion into the mathematical realm: making it the object of productive 
mathematical inquiry. That is the goal of the following definition. 


Definition 2.2 ψ FOLLOWS FROM Γ if and only if it is logically impossible for an
interpretation to make all the members of Γ true and ψ false. If ψ follows from Γ,
we write: Γ ⊨ ψ. If ψ follows from axioms of PA, we write: PA ⊨ ψ.


OK, we still have work to do: we do not yet have a mathematical definition
of logical impossibility. Supplying such a definition is one of the most important
jobs of set theory. (You might think of set theory as an inventory of logically possible
structures.) We will leave a serious exposition of set theory for later chapters. For now,
we will try to make do with a less systematic understanding of logical impossibility.
Even without a full-blown theory, we have a good idea of how to establish that 
a situation is logically impossible: assume that the situation is real and derive a 
contradiction from that assumption. 

There is another bit of work we will leave undone: we will not offer a mathematically
precise account of what our logical symbols mean. This can be done. (You may
have already seen it done in a logic course. You saw a bit of it done in Chap. 1.) We
will just not be doing it here. If, in what follows, you need to figure out whether it is
logically possible for an interpretation to make certain sentences true or false, you
will have to draw on your rough-and-ready understanding of those sentences. I hope
it is clear, for example, that no interpretation can make ‘0 = 0’ false since that would
require that the object assigned to ‘0’ be distinct from the object assigned to ‘0’.

It turned out to be mathematically fruitful to give a mathematical definition of the
“follows from” relation. In his doctoral dissertation of 1929, for example, Kurt Gödel
proved that Frege’s formalization of first-order logic is COMPLETE: any conclusion
that follows from a set of premises is derivable from that set.5


Theorem 2.1 Γ ⊨ ψ only if Γ ⊢ ψ.


4 More modestly, we want an interpretation that would make the first six axioms true if there were
such things as the natural numbers. This sort of conditional claim is what I will generally intend
when I talk about an interpretation making certain sentences true.


5 Gödel’s dissertation is reprinted, with an English translation, in Gödel [7, pp. 60-101].


Frege’s formalization of first-order logic is also SOUND: any conclusion that is
derivable from a set of premises follows from that set.


Theorem 2.2 Γ ⊢ ψ only if Γ ⊨ ψ.


Exercise 2.5 Formal proofs are all finite: they consist of finitely many lines featuring
only finitely many premises. Use this fact, and the preceding two theorems, to prove
Gödel’s COMPACTNESS THEOREM: if a formula follows from Γ, it follows from a
finite subset of Γ.


Exercise 2.6 Suppose it is logically possible for PA to have a model. Show that
PA ⊬ 0 ≠ 0.


Exercise 2.7 Suppose PA ⊬ 0 ≠ 0. Show that it is logically possible for PA to have
a model.


We are going to see that PA suffers from a particular form of incompleteness. First, 
though, we need to reflect on the relationship between numbers and numerals. The 
NUMERALS of PA are ‘0’ and any terms consisting of an occurrence of ‘0’ preceded
by finitely many occurrences of ‘S’. That is:


‘0’, ‘S0’, ‘SS0’, ‘SSS0’, ...


We understand the natural numbers to be 0 and everything obtainable from 0
by finitely many applications of the successor operation S. So it follows from our
conception of the natural numbers that each of them is named by a numeral of PA
when these numerals are interpreted in the standard way (with ‘0’ naming 0 and
‘S’ expressing the successor operation). For example, the number obtained from 0
by 15 applications of S is named by the numeral consisting of an occurrence of ‘0’
preceded by 15 occurrences of ‘S’. The STANDARD numbers are 0 and all the numbers
obtainable from 0 by finitely many applications of S: that is, exactly the numbers 
named by our numerals. It is part of our concept of the natural numbers that each of 
them is standard. So, if PA completely captures our concept of the natural numbers, 
the axioms of PA will rule out non-standard numbers: that is, it will be logically 
impossible for an interpretation to make all the axioms of PA true while including 
in the range of PA’s bound variables an object that is not named by any numeral 
of PA. 

At this point it is useful to introduce two new symbols: the proper name ‘c’ and
the inequality symbol ‘≠’. We use ‘≠’ to make our claims about non-identity a bit
more readable:

$$\alpha \neq \beta \Longrightarrow \neg\, \alpha = \beta.$$

You might think of ‘c’ as the name of a natural number (though I have not said
which one). Now consider the following sequence of sentences.

$$c \neq 0, \quad c \neq S0, \quad c \neq SS0, \quad c \neq SSS0, \quad \ldots$$
Let C be the set of all these sentences. Each member of C says something
compatible with our conception of the natural numbers. For example,

$$c \neq SSSSS0$$

makes the innocent claim that a certain (so far unidentified) natural number is
distinct from five. Given any model of PA, we could make the above sentence true by
assigning to ‘c’ an object in the range of our bound variables. Just let c be anything
other than the object named by ‘SSSSS0’.


Exercise 2.8 Suppose C* is a proper subset of C. That is, every member of C* is a
member of C, but not every member of C is a member of C*. Show the following:
given any model of PA, we can make all the members of C* true by assigning to ‘c’
an object in the range of our bound variables. You may assume the following: If α
and β are PA-numerals and if a model of PA makes the equation ⌜α = β⌝ true, then
α = β. (Can you see why this is so?)


Let PA ∪ C be the set consisting of the axioms of PA and the members of C.
An interpretation of PA ∪ C is an interpretation of PA that, in addition, assigns to
‘c’ an object in the range of PA’s bound variables. We now adopt a premise for a
conditional proof: that is, an argument intended to establish an “if ...then” statement.
We need not believe that this premise is true. Our project is to see what follows
from it.


Premise (for conditional proof): PA completely captures our concept of the natural
numbers and, so, rules out non-standard numbers.


This would mean that a model of PA ∪ C in which c behaves like a non-standard
number is logically impossible. On the other hand, if we make all the members of
C true, c will have to be non-standard, since c will be distinct from every number
named by a numeral of PA. So our premise implies that it is logically impossible for
an interpretation to make all the members of PA ∪ C true. This means that

$$\mathrm{PA} \cup C \models \psi$$

no matter what ψ is. (Check the definition of ‘follows from’ to confirm this.) For
example,

$$\mathrm{PA} \cup C \models 0 \neq 0.$$

So, by the Compactness Theorem, ‘0 ≠ 0’ follows from finitely many members
of PA ∪ C. We may suppose, then, that

$$\mathrm{PA}^* \cup C^* \models 0 \neq 0$$

where PA* is a finite set of PA-axioms and C* is a finite subset of C. Any interpretation
that makes all the members of PA true will make all the members of PA* true
and can easily be extended to make the finitely many members of C* true (as you
showed in the last exercise). Such an interpretation would have to make ‘0 ≠ 0’ true
since ‘0 ≠ 0’ follows from PA* ∪ C*. But that is logically impossible. (It is logically
impossible for ‘0’ to name an object that is not the same object as itself.) So there can
be no such interpretation. That is, PA is UNSATISFIABLE: it is logically impossible
for an interpretation to make all the axioms of PA true; it is logically impossible for
PA to have a model. So

$$\mathrm{PA} \models 0 \neq 0$$

and, hence, by the Completeness Theorem,

$$\mathrm{PA} \vdash 0 \neq 0.$$

That is, PA is INCONSISTENT.6 We reached this conclusion by first supposing that
PA completely captures our concept of the natural numbers. So our grand conclusion
is: PA completely captures our concept of the natural numbers only if PA is
inconsistent.
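
The step that extends a model to cover a finite C* is easy to picture in the intended interpretation: pick a number larger than every number mentioned. The sketch below illustrates only that special case. It assumes the standard model, represents each sentence ‘c ≠ m’ by its numeral m, and uses the hypothetical names `numeral`, `value`, and `witness`; the exercise itself asks for an argument that works in any model.

```python
def numeral(n: int) -> str:
    """PA-numeral for n: '0' preceded by n occurrences of 'S'."""
    return "S" * n + "0"

def value(num: str) -> int:
    """Number named by a PA-numeral under the standard reading."""
    if not num.endswith("0") or set(num[:-1]) - {"S"}:
        raise ValueError(f"not a PA-numeral: {num!r}")
    return len(num) - 1

def witness(c_star: list[str]) -> int:
    """Choose a value for 'c' that differs from every number named in a
    finite list of numerals: one more than the largest number mentioned."""
    return max((value(m) for m in c_star), default=-1) + 1
```

For the finite set whose numerals are `numeral(0)`, `numeral(3)`, and `numeral(7)`, `witness` returns 8, which is distinct from all three. No single choice works for all of C at once, and that is where compactness does its work.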

If PA is consistent, it is INCOMPLETE. (It is a bit ironic that the incompleteness of
PA follows from Gödel’s completeness theorem.) What is at issue here is expressive
incompleteness. There are mathematically important properties of the natural num- 
bers that cannot even be expressed in the language of PA. Even if we use infinitely 
many sentences of PA, we cannot assert that every natural number is standard nor, 
to take another example, can we assert that each natural number is greater than only 
finitely many natural numbers. 


Exercise 2.9 Sketch a compactness argument showing that PA allows for infinitely 
large numbers. It might be helpful to start by using the vocabulary of PA to define 
the “less than” symbol ‘<’. Feel free to define any other vocabulary you find useful. 
Feel free, too, to assume that PA is consistent. You may, if you wish, fill in all the 
details of the argument, but I am only asking for the general idea. 


2.4 Incompleteness 2: Representability 


In the intended interpretation of PA, the PA-NUMERAL for a natural number n
consists of an occurrence of ‘0’ preceded by n occurrences of ‘S’. When we use the
informal variable ‘n’ to make general claims about natural numbers, we can use the
informal variable ‘n̄’ to make general claims about the corresponding PA-numerals.
For example, we might say: if n is a natural number, then the PA-numeral n̄ consists
of an occurrence of ‘0’ preceded by n occurrences of ‘S’.


6 Here are three equivalent definitions of inconsistency: an inconsistent theory is one that proves
a logical absurdity such as ‘0 ≠ 0’; an inconsistent theory is one that proves a sentence φ and its
negation ¬φ; an inconsistent theory is one that proves every sentence in its language.
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Suppose f is a one-place, primitive recursive, characteristic function. By “prim- 
itive recursive”, we mean essentially what we meant in the preceding chapter with 
any modifications necessary to accommodate the number zero. A one-place function 
takes natural numbers one at a time as inputs. A characteristic function has only two 
possible outputs: zero and one. 

Suppose $\varphi(x)$ is a PA-formula with free occurrences of ‘x’ and no free occurrences
of any other variable. Then $\varphi(x)$ is said to REPRESENT f in PA if it has the following
properties. If n is a natural number and $f(n) = 1$, then PA proves the sentence that
results when we replace every free occurrence of ‘x’ in $\varphi(x)$ with an occurrence of
the PA-numeral for n. That is:


$$f(n) = 1 \implies \mathrm{PA} \vdash \varphi(\bar{n}).$$


If n is a natural number and $f(n) = 0$, then PA proves the sentence that results
when we replace every free occurrence of ‘x’ in $\neg\varphi(x)$ with an occurrence of the
PA-numeral for n. That is:


$$f(n) = 0 \implies \mathrm{PA} \vdash \neg\varphi(\bar{n}).$$


If f is represented in PA by a PA-formula, then we say that f is REPRESENTABLE 
in PA. 

Each characteristic function answers a yes-or-no question about the natural num- 
bers. It answers “yes” by returning the value 1. It answers “no” by returning the value 
0. (In Chap. 1, || was “yes”, while | was “no”.) Suppose the characteristic function 
f is represented in PA by the formula $\varphi(x)$. By feeding f a number n, we can learn
whether PA thinks n has the property expressed by $\varphi(x)$. f will answer “yes” only
if PA proves $\varphi(\bar{n})$. f will answer “no” only if PA refutes $\varphi(\bar{n})$.

Here is an example. We can infer from Definition 1.14 that there is a primitive
recursive characteristic function $\mathfrak{O}$ that behaves as follows.


$$\mathfrak{O}(n) = \begin{cases} 1 & \text{if } n \text{ is odd} \\ 0 & \text{otherwise.} \end{cases}$$


$\mathfrak{O}$ tells us whether the numbers we feed it are odd. If we want to find a PA-formula
that represents $\mathfrak{O}$, we should look for one that somehow expresses the property of
oddness. If $\mathfrak{O}$ were represented in PA by $\varphi(x)$, then PA would prove each of the
sentences


$$\varphi(S0),\ \varphi(SSS0),\ \varphi(SSSSS0),\ \varphi(SSSSSSS0),\ \ldots$$

and would disprove each of the sentences


$$\varphi(0),\ \varphi(SS0),\ \varphi(SSSS0),\ \varphi(SSSSSS0),\ \ldots$$

That is, PA would prove $\varphi(\bar{n})$ whenever n is odd and would prove $\neg\varphi(\bar{n})$ whenever
n is even. Can we identify such a formula $\varphi(x)$? Well, does the vocabulary of PA
allow us to say that a number x is odd? Of course it does. We just say that x is not a
multiple of two:

$$\forall y\ (y \cdot SS0) \neq x.$$


As it turns out, this formula really does represent $\mathfrak{O}$ because PA proves each of the
sentences


$$\forall y\ (y \cdot SS0) \neq S0,\quad \forall y\ (y \cdot SS0) \neq SSS0,\quad \forall y\ (y \cdot SS0) \neq SSSSS0,\quad \ldots$$

and disproves each of the sentences

$$\forall y\ (y \cdot SS0) \neq 0,\quad \forall y\ (y \cdot SS0) \neq SS0,\quad \forall y\ (y \cdot SS0) \neq SSSS0,\quad \ldots$$

So, when we offer $\mathfrak{O}$ the number n and $\mathfrak{O}$ responds “yes”, we do not just learn
that n is odd: we learn that PA thinks n is odd. When we offer $\mathfrak{O}$ the number n and $\mathfrak{O}$
responds “no”, we do not just learn that n is even: we learn that PA thinks n is even.
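
It is easy to blur truth and provability here, so a small sketch may help keep them apart. The Python below (the names `odd` and `not_multiple_of_two` are my own assumptions) checks only that the property “x is not a multiple of two” is true of exactly the odd numbers below a bound. It says nothing about what PA proves: that would require a formal PA-derivation for each numeral.

```python
def odd(n: int) -> int:
    """Characteristic function for oddness: 1 if n is odd, 0 otherwise."""
    return n % 2


def not_multiple_of_two(x: int) -> bool:
    """Evaluate 'for every y, y * 2 != x' over the standard numbers.

    Any y with y * 2 == x satisfies y <= x, so searching 0..x suffices.
    """
    return all(y * 2 != x for y in range(x + 1))


# The function says "yes" exactly where the property holds (checked below 200).
agree = all((odd(n) == 1) == not_multiple_of_two(n) for n in range(200))
```

The bounded search mirrors the fact that the truth of each instance can be settled by a finite computation, which is what makes a function like this a candidate for representation in PA.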


Here is another example. We know from Definition 1.15 that there is a primitive
recursive characteristic function $\mathfrak{P}$ that behaves as follows.


$$\mathfrak{P}(n) = \begin{cases} 1 & \text{if } n \text{ is prime} \\ 0 & \text{otherwise.} \end{cases}$$

Exercise 2.10 Identify a PA-formula that represents $\mathfrak{P}$ in PA.


It is no fluke that $\mathfrak{O}$ and $\mathfrak{P}$ are representable in PA. Gödel proved in 1930 that all
of our primitive recursive functions f are representable in PA.7 So if


$$f(a) = 1$$


is a true $\Pi^0_1$ sentence, there is a formula $\varphi(x)$ that represents f in PA and, so, PA
proves each of the sentences


$$\varphi(0),\ \varphi(S0),\ \varphi(SS0),\ \varphi(SSS0),\ \ldots$$

Since PA does not guarantee that

$$0,\ S0,\ SS0,\ SSS0,\ \ldots$$


are the only objects in the range of its bound variables, we should hesitate to infer
that PA will prove


7 See Gödel [6]; English translation in Gödel [7, pp. 145-195], and van Heijenoort [10, pp. 596-616].

$$\forall x\, \varphi(x).$$


Indeed, Gödel identified primitive recursive functions f represented by
PA-formulas $\varphi(x)$ such that, if PA is consistent, it is true that $f(a) = 1$, but $\forall x\, \varphi(x)$
is not provable in PA. That is, Gödel actually identified formulas $\varphi(x)$ with the
following odd property. PA proves each of the sentences


$$\varphi(0),\ \varphi(S0),\ \varphi(SS0),\ \varphi(SSS0),\ \ldots$$


and, so, confirms that each standard number satisfies $\varphi(x)$. Yet, if PA is consistent,
PA is unable to confirm that every number, every object in the range of its
bound variables, satisfies $\varphi(x)$. (If PA is inconsistent, it “confirms” everything: every
PA-sentence is a PA-theorem.)

Suppose, on the other hand, that $f(a) = 1$ and $\mathrm{PA} \vdash \neg\forall x\, \varphi(x)$. Then, by
Theorem 2.2, $\mathrm{PA} \models \neg\forall x\, \varphi(x)$ and, hence, every model of PA makes $\neg\forall x\, \varphi(x)$
true (since no interpretation makes all the PA-axioms true and $\neg\forall x\, \varphi(x)$ false). So
every model of PA makes $\forall x\, \varphi(x)$ false. However, since $\varphi(x)$ represents f in PA,
Theorem 2.2 implies that every model of PA makes each of the sentences


$$\varphi(0),\ \varphi(S0),\ \varphi(SS0),\ \varphi(SSS0),\ \ldots$$


true. This would mean that every model of PA includes a non-standard number in
the range of PA’s bound variables (a number that makes $\forall x\, \varphi(x)$ false). That is,
PA would have no STANDARD MODELS and, hence, would be logically incompatible
with our concept of the natural numbers. Our grand conclusion: if PA is compatible
with our concept of the natural numbers, then PA does not disprove $\forall x\, \varphi(x)$.

Suppose, now, it is logically possible for PA to have a standard model (a model in 
which every object in the range of PA’s bound variables is named by a PA-numeral). 
Then Gödel has shown us how to identify primitive recursive functions f represented
by PA-formulas $\varphi(x)$ such that (1) it is true that $f(a) = 1$, but (2) $\forall x\, \varphi(x)$ is neither
provable nor refutable in PA. PA-sentences that are neither provable nor refutable 
in PA are said to be UNDECIDABLE in PA. We shall now consider what sort of an f 
would allow us to establish undecidability. 

We first note that we can use natural numbers to code formulas of PA. Let’s code 
some of the symbols of PA as follows. 


Now consider the PA-sentence ‘$\neg 0 = 0$’. To code this sentence, first replace each
symbol with its code:

1, 8, 4, 8

Now take the product of the first four prime numbers raised to the powers
1, 8, 4, 8:


$$2^1 \cdot 3^8 \cdot 5^4 \cdot 7^8.$$

This number

47,278,574,201,250


is the GÖDEL NUMBER of the PA-sentence ‘$\neg 0 = 0$’.8 Working the other direction,
if you were presented with the number 47,278,574,201,250, you could ask your
calculator or computer to factor it. Then, noting the exponents and consulting our
table of symbol codes, you could reconstruct the PA-sentence ‘$\neg 0 = 0$’.
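
Both directions of this coding can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names are my own, and the dictionary `SYMBOL_CODES` holds only the three codes that can be read off the exponents 1, 8, 4, 8 used above. The rest of the symbol table is deliberately left out.

```python
# Only the codes recoverable from the worked example: not, equals, zero.
SYMBOL_CODES = {"¬": 1, "=": 4, "0": 8}
CODE_SYMBOLS = {code: symbol for symbol, code in SYMBOL_CODES.items()}


def primes():
    """Yield 2, 3, 5, 7, ... using trial division."""
    candidate = 2
    while True:
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            yield candidate
        candidate += 1


def godel_number(sentence: str) -> int:
    """Multiply the first primes, each raised to the code of the next symbol."""
    number = 1
    for prime, symbol in zip(primes(), sentence):
        number *= prime ** SYMBOL_CODES[symbol]
    return number


def decode(number: int) -> str:
    """Factor the number and translate the exponents back into symbols."""
    symbols = []
    for prime in primes():
        if number == 1:
            break
        exponent = 0
        while number % prime == 0:
            number //= prime
            exponent += 1
        symbols.append(CODE_SYMBOLS[exponent])
    return "".join(symbols)
```

Calling `godel_number("¬0=0")` (the sentence written without spaces) reproduces the number computed above, and `decode` recovers the sentence from it. Decoding a number that uses codes outside the three listed here would raise a `KeyError` until the table is extended.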


Exercise 2.11 Decode the Gödel number 2,349,101,964,825,000.


Exercise 2.12 Decode the Gödel number
96,860,719,328,790,117,762,174,283,536,741,369,039,814,728,960,000.


We can use Gödel numbering to code sequences of PA-formulas. For example, if
$\varphi$ and $\psi$ are PA-formulas with codes $\#\varphi$ and $\#\psi$, we can let


$$2^{\#\varphi} \cdot 3^{\#\psi}$$


code the ordered pair $\langle \varphi, \psi \rangle$. (We already used this trick in Chap. 1.) We can use
similar techniques to code formalized PA-proofs. Suppose we have done so. Gödel
figured out how to identify a primitive recursive function g, represented in PA by a
PA-formula $\gamma(x)$, that behaves as follows.9


8 The corresponding PA-numeral consists of a single occurrence of ‘0’ preceded by 
47,278,574,201,250 occurrences of ‘S’. If you were to produce a token of this numeral, it would 
be about 100 million km long. That is about two thirds the distance from the Earth to the Sun or 
about 2,500 times the circumference of the Earth. This suggests that it may be naive to think of 
PA-numerals as actual physical objects. 


9 For a readable discussion of Gödel’s construction, see Nagel and Newman [8]. Another helpful
resource on this and other issues of interest to us is George and Velleman [5]. It might help you wrap
your brain around Gödel’s proof if you read $\gamma(\bar{n})$ as “n does not code a PA-proof of G” where G
is a certain extra-special sentence of PA. Then $\forall x\, \gamma(x)$, the universal generalization of $\gamma(\bar{n})$, says
that no natural number codes a PA-proof of G. Now it so happens that $\forall x\, \gamma(x)$ is G. So G says
of itself that it is not provable in PA. A PA-proof of G would prove that G is not provable in PA:
a strange situation, to say the least. Of course, all this is a bit sloppy. $\gamma(\bar{n})$ does not say anything
unless we interpret it. Furthermore, under the intended interpretation it does not say anything about
PA-proofs: it only refers to natural numbers. But the road to clarity is sometimes paved with slop.

$$g(n) = \begin{cases} 1 & \text{if } n \text{ does not code a proof in PA of } \forall x\, \gamma(x) \\ 0 & \text{otherwise.} \end{cases}$$


Since $\gamma(x)$ represents g in PA we also know:

$$g(n) = 1 \implies \mathrm{PA} \vdash \gamma(\bar{n})$$
$$g(n) = 0 \implies \mathrm{PA} \vdash \neg\gamma(\bar{n}).$$


If we offer g the number n and g responds “yes”, then n does not code a PA-proof
of the generalization $\forall x\, \gamma(x)$, but PA does prove the instance $\gamma(\bar{n})$. The other case
is more interesting: if we offer g the number n and g responds “no”, then n codes a
PA-proof of $\forall x\, \gamma(x)$ and PA proves $\neg\gamma(\bar{n})$. But it is not coherent to say that every
number has a certain property and, also, that a particular number does not. So if
g were ever to say “no”, that would mean PA is inconsistent. Let us run through
this argument more carefully. Suppose $\mathrm{PA} \vdash \forall x\, \gamma(x)$. Then we can pick a natural
number k that codes a PA-proof of $\forall x\, \gamma(x)$. Note that $g(k) = 0$. So, since $\gamma(x)$
represents g, $\mathrm{PA} \vdash \neg\gamma(\bar{k})$. But, since $\forall x\, \gamma(x)$ is derivable in PA, $\mathrm{PA} \vdash \gamma(\bar{k})$. That
is, PA is inconsistent. Our conclusion: if PA is consistent, then PA does not prove
$\forall x\, \gamma(x)$. Suppose, now, it is logically possible for PA to have a standard model.
Then, by Exercise 2.6, PA is consistent. So no natural number codes a PA-proof of
$\forall x\, \gamma(x)$ and, hence, the $\Pi^0_1$-sentence


$$g(a) = 1$$

is true. So PA proves each of the sentences

$$\gamma(0),\ \gamma(S0),\ \gamma(SS0),\ \gamma(SSS0),\ \ldots$$


and, hence, by our earlier reasoning, PA does not disprove $\forall x\, \gamma(x)$ (since, otherwise,
every model of PA would have to feature a non-standard number witnessing to the
falsehood of $\forall x\, \gamma(x)$). Our grand conclusion: if it is logically possible for PA to have a
standard model, then $\forall x\, \gamma(x)$ is undecidable (neither provable nor refutable) in PA.

We earlier saw that PA suffers from a kind of expressive incompleteness: there are 
mathematically important properties of the natural numbers that cannot be expressed 
in the language of PA. We now see that PA suffers from a kind of proof-theoretic 
incompleteness: there are questions expressible in the language of PA that cannot be 
settled by a proof or refutation in PA. That is, there are cases where PA’s expressive 
resources are up to snuff, but its capacity to supply proofs is not. 

Here is another example of an undecidable sentence. There is a primitive recursive
function s, represented in PA by a PA-formula $\sigma(x)$, that behaves as follows.


$$s(n) = \begin{cases} 1 & \text{if } n \text{ does not code a proof in PA that } 0 \neq 0 \\ 0 & \text{otherwise.} \end{cases}$$

Suppose PA is consistent. That is, suppose PA does not prove the absurdity ‘$0 \neq 0$’.
Then no natural number codes a PA-proof of ‘$0 \neq 0$’ and, hence, the $\Pi^0_1$-sentence


$$s(a) = 1$$

is true. So PA proves each of the sentences

$$\sigma(0),\ \sigma(S0),\ \sigma(SS0),\ \sigma(SSS0),\ \ldots$$


and, hence, if it is logically possible for PA to have a standard model, PA does not
disprove $\forall x\, \sigma(x)$. It turns out that PA does prove


$$(\forall x\, \sigma(x) \rightarrow \forall x\, \gamma(x)).$$


So, if PA proved $\forall x\, \sigma(x)$, it would also prove $\forall x\, \gamma(x)$. We conclude: if it is logically
possible for PA to have a standard model, then $\forall x\, \sigma(x)$ is undecidable in PA.

Suppose there is a PA-formula $\chi(x)$ that somehow expresses the idea that x codes
a PA-proof of ‘$0 \neq 0$’. If $\chi(x)$ is really doing a good job of expressing that idea
inside PA, we would expect that, for each natural number n, if n does not code a
PA-proof of ‘$0 \neq 0$’, then $\mathrm{PA} \vdash \neg\chi(\bar{n})$ (that is, PA will refute the sentence asserting
that n does do the coding). Suppose this is the case.


Exercise 2.13 Prove that: $\mathrm{PA} \vdash \forall x\, \neg\chi(x)$ only if PA is inconsistent. (You might
start by considering the relationship between the formula $\neg\chi(x)$ and the function s.)


Exercise 2.14 Suppose that, for each natural number n, $\mathrm{PA} \vdash \neg\chi(\bar{n})$ only if n does
not code a PA-proof of ‘$0 \neq 0$’. Show that this implies the consistency of PA.


Exercise 2.13 shows: if PA allowed us to say that no natural number codes a
PA-proof of ‘$0 \neq 0$’, PA would not allow us to prove this unless it allowed us
to prove everything. So we will not be able to use a PA-proof to demonstrate the 
consistency of PA. More generally, we will not be able to prove the consistency of 
PA using methods formalizable in PA. 

Note that a PA-proof of PA’s consistency would not be as pointless as it might 
first appear. The PA-proof would use only finitely many axioms of PA: we would be 
relying on only finitely many axioms to show that no combination of the infinitely 
many PA-axioms proves an absurdity. Those finitely many axioms might have had 
some property that made their consistency evident or, at least, more evident than 
the consistency of PA as a whole. So the PA-proof might have provided a non- 
circular reason for believing PA consistent. Alas, we now recognize that this is not 
to be. 

2.5 Why Fret About Consistency?


Hilbert thought that mathematicians frequently devote time and talent to proofs of 
sentences that have no content: sentences that may seem to state something, but really 
state nothing. He thought it important to justify this practice in some way. He recog- 
nized that consistency proofs could provide an especially powerful justification. Here 
is why. 

Suppose f is a primitive recursive characteristic function represented in PA by
$\varphi(x)$. If $f(n) = 0$, then $\mathrm{PA} \vdash \neg\varphi(\bar{n})$. So if PA is consistent and $\mathrm{PA} \vdash \varphi(\bar{n})$, then
$f(n) = 1$ (since, otherwise, PA would prove both $\varphi(\bar{n})$ and $\neg\varphi(\bar{n})$). So if PA is
consistent and $\mathrm{PA} \vdash \forall x\, \varphi(x)$ (and, hence, PA proves each instance $\varphi(\bar{n})$), then


$$f(0) = 1,\quad f(1) = 1,\quad f(2) = 1,\quad \ldots$$


are all true and, indeed, the $\Pi^0_1$ sentence


$$f(a) = 1$$


is true. If we are convinced that PA is consistent, we can feel free to use the machinery
of PA to verify $\Pi^0_1$ sentences. We could think of PA as a trustworthy oracle. If the
great oracle PA says that $\forall x\, \varphi(x)$ is true, then that settles it: it really is true that
$f(a) = 1$. The same argument applies to any theory in which all the PA-axioms
are derivable: if the theory is consistent, we can use it to verify $\Pi^0_1$ statements.
Let your imagination run wild: as long as your fanciful tales are not fundamentally 
incoherent, we can use them to verify propositions such as Fermat’s Last Theorem 
and Goldbach’s Conjecture. What matters is consistency, not truth. 
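
To see what kind of claim is at stake, it may help to look at a computable function whose universal statement is still open. The following Python sketch is an illustration under my own assumptions: the function `goldbach` and its encoding (n stands for the even number 2n + 4) are hypothetical choices, not the text's. Each value of the function is settled by a bounded search, yet no amount of such checking establishes that the value is 1 for every n.

```python
def is_prime(k: int) -> bool:
    """Trial-division primality test."""
    return k >= 2 and all(k % d for d in range(2, int(k ** 0.5) + 1))


def goldbach(n: int) -> int:
    """Return 1 if the even number 2n + 4 is a sum of two primes, else 0."""
    even = 2 * n + 4
    found = any(is_prime(p) and is_prime(even - p)
                for p in range(2, even // 2 + 1))
    return int(found)


def first_counterexample(bound: int):
    """Return the least n < bound with goldbach(n) == 0, or None if there is none."""
    return next((n for n in range(bound) if goldbach(n) == 0), None)
```

A search such as `first_counterexample(200)` can refute the general claim by finding a counterexample, but it can never confirm it. That is why a proof in a consistent theory extending PA would be so valuable.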

If we have good reason to believe PA consistent, then we have good reason to 
use PA to prove theorems. If a theorem itself corresponds to a $\Pi^0_1$ sentence, we will
verify that sentence. Otherwise, we will obtain results that (meaningless or not) may
contribute to other proofs and, so, help us verify $\Pi^0_1$ sentences. Note that we need a
good reason for believing PA consistent. A mathematical proof would certainly be a 
good reason. I note, though, that not every good reason in mathematics is supplied by 
a proof. (We might have very good reasons for adopting axioms, though axioms are 
statements we accept without proof.) GERHARD GENTZEN (1909-1945) did prove 
the consistency of PA using techniques not formalizable in PA.10 If you are already
convinced that each axiom of PA is a true statement about the natural numbers, you 
may feel no need for such a proof. After all, a collection of true sentences cannot be 
inconsistent. Indeed, sentences that could all be true cannot be inconsistent. I suppose 
a reasonable person could, nonetheless, question the consistency of PA. However, 
the logician SOLOMON FEFERMAN reports that, among today’s mathematicians, “the 
number who doubt that PA is consistent is vanishingly small”.11


10 See Gentzen [3]; English translation in Gentzen [4, pp. 132-213].
11 See Feferman [1, p. 192].

As I already mentioned, the above reasoning applies to any theory that extends
PA, any theory in which all the PA-axioms are derivable. If we have good reason 
to believe such a theory consistent, we have good reason to use the theory to prove 
theorems. In the next chapter, we begin our study of an important family of such 
theories: set theories. 


2.6 Solutions of Odd-Numbered Exercises 


2.1 No. Yes. Yes. No. No. Yes. 
2.3 A1 says that, no matter what y is, $(y + 0) = y$. So, in particular, $(0 + 0) = 0$.
Suppose, as an inductive hypothesis, that $(0 + x) = x$. Then $S(0 + x) = Sx$. A2
says that, no matter what x and y are, $(y + Sx) = S(y + x)$. So, in particular,
$(0 + Sx) = S(0 + x)$ and, hence, $(0 + Sx) = Sx$. We have shown

$$((0 + x) = x \rightarrow (0 + Sx) = Sx).$$


Since we have not used any special information about x, x could be anything. That
is, $\forall x\, ((0 + x) = x \rightarrow (0 + Sx) = Sx)$. One of our induction axioms assures us that


$$((0 + 0) = 0 \rightarrow (\forall x\, ((0 + x) = x \rightarrow (0 + Sx) = Sx) \rightarrow \forall x\, (0 + x) = x)).$$


So $\forall x\, (0 + x) = x$.


2.5 Suppose $\Gamma \models \varphi$. Then, by the completeness theorem, $\Gamma \vdash \varphi$. That is, there is a
formal proof whose conclusion is $\varphi$ and whose premises are members of $\Gamma$. Let the
members of $\Gamma'$ be the finitely many members of $\Gamma$ that appear in the proof. Then
$\Gamma' \vdash \varphi$ and, hence, by the soundness theorem, $\Gamma' \models \varphi$.


2.7 Suppose $\mathrm{PA} \nvdash 0 \neq 0$. Then, by the completeness theorem, $\mathrm{PA} \not\models 0 \neq 0$. That
is, it is logically possible for an interpretation to make all the axioms of PA true and
‘$0 \neq 0$’ false.


2.9 We introduce the “less than” symbol by stipulating that $\ulcorner \alpha < \beta \urcorner$ is an abbreviation
of $\ulcorner \neg\forall z\, (\alpha + Sz) \neq \beta \urcorner$. (The idea is that x < y if and only if you can get to
y by adding some non-zero number to x.) Let the members of $C$ be the sentences


$$0 < c,\quad S0 < c,\quad SS0 < c,\quad SSS0 < c,\quad \ldots$$


Let $C^*$ be a finite subset of $C$. Each member of $C^*$ is an inequality $\ulcorner \alpha < c \urcorner$ where
$\alpha$ is a PA-numeral. Since $C^*$ is finite we can pick a PA-numeral $\beta$ longer than all
those $\alpha$’s. We are assuming that PA is consistent. So, by the completeness theorem
and Exercise 2.7, it is logically possible for PA to have a model. Suppose we have
picked such a model. That model will assign an object to $\beta$. Extend the model by
assigning the same object to ‘c’. Our new model will make $\ulcorner c = \beta \urcorner$ true. If our
new model makes an inequality $\ulcorner \alpha < c \urcorner$ in $C^*$ false, then (since our model thinks
‘c’ and $\beta$ name the same thing) it will make the inequality $\ulcorner \alpha < \beta \urcorner$ false where $\alpha$
and $\beta$ are PA-numerals, the latter longer than the former. We can show that this is
impossible. (You might enjoy working out the details.) So, in fact, our new model
makes all the members of $C^*$ true. More generally, if $C^*$ is any finite subset of $C$, it
is logically possible for $\mathrm{PA} \cup C^*$ to have a model. If every member of $C$ were true,
then c would exceed every finite value and, so, would deserve to be called infinitely
large. Suppose PA rules out infinitely large numbers. Then the combination $\mathrm{PA} \cup C$
is incoherent and, so, can have no models. Since no models are possible,


$$\mathrm{PA} \cup C \models 0 \neq 0.$$


Use the compactness theorem to pick a finite subset $C^*$ of $C$ such that


$$\mathrm{PA} \cup C^* \models 0 \neq 0.$$


Then $\mathrm{PA} \cup C^*$ cannot have a model, contrary to our earlier result. So we were wrong
to suppose that PA rules out infinitely large numbers.


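2.11 $\forall x\ x = x$.
2.13 First note the following.


$$\begin{aligned} s(n) = 1 &\implies n \text{ does not code a proof in PA that } 0 \neq 0 \\ &\implies \mathrm{PA} \vdash \neg\chi(\bar{n}) \end{aligned}$$

$$\begin{aligned} s(n) = 0 &\implies n \text{ codes a proof in PA that } 0 \neq 0 \\ &\implies \text{PA proves every sentence of PA} \\ &\implies \mathrm{PA} \vdash \neg\neg\chi(\bar{n}). \end{aligned}$$


So $\neg\chi(x)$ represents s in PA and, hence, by our earlier reasoning, if PA proved
$\forall x\, \neg\chi(x)$, it would also prove $\forall x\, \gamma(x)$. But PA does that only if it is inconsistent.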

Chapter 3
Hereditarily Finite Lists 


3.1 What Sets Are Not 


Professors really do mean well. If they are excited by the mathematical theory of sets, 
they will want others to be excited too. They will try to entice their students, to make 
their initial experience as pleasant as possible. They may say, “Sets are like...,” 
with the blank occupied by some reference to a non-threatening item of everyday 
experience. “Sets are like boxes.” “A set is like a flock of geese.” “A set is what you 
get when you bunch stuff together.” 

Professors do mean well. When they say, “Sets are like...,” their intentions are
pure and noble. They want to communicate something interesting and important. By 
putting students at ease, they hope to communicate it much more effectively. In the 
face of such good intentions, I should probably keep my mouth shut. But here it is: 
the students are being hoodwinked. I am confident there are no objects of everyday 
experience that closely resemble mathematical sets. The popular incantations, “Sets
are like...” may be psychologically and pedagogically beneficial, but they all fall
apart under critical scrutiny. When treated as factual assertions, rather than poetry,
they turn out to be as flimsy as wet paper.1 Here is one example.

Professor: Sets are like boxes. The things in a box are like the members of a set. 

Student: Here are two boxes. Everything in one is in the other, because, in fact, 
they are both empty. If sets are like boxes, then I guess different sets can have the 
same members. 

Professor: Well, no. A basic principle of set theory is that sets with the same 
members are the same. 

Student: So if I want to arrange for some boxes to behave like sets, I have to make 
sure that different ones have different contents. 

Professor: Yes. 


1 For references and further discussion, see Pollard [11], Chap. 3.


S. Pollard, A Mathematical Prelude to the Philosophy of Mathematics, 55 
DOI: 10.1007/978-3-319-05816-0_3,
© Springer International Publishing Switzerland 2014 

Student: What if I put my pen in this box? It’s the same box, but now it has
something in it that wasn’t there before. If sets are like boxes, then I guess sets with 
different members can be the same. 

Professor: Certainly not. It’s a basic principle of mathematical logic that sets with 
different members are different. 

Student: So my boxes will behave like sets as long as I seal them all up and don’t 
let anyone shift things from one to another. 

Professor: Yes. 

Student: I put my pen in box A. Now I’m putting box A in box B. So now my pen 
is also in box B. If sets are like boxes, then I guess anything that’s a member of a 
member of a set z will itself be a member of z. 

Professor: Well, no. If x is a member of y and y is a member of z, it doesn’t follow
that x is a member of z. 

Student: So the things in a box will be like the members of a set as long as we 
take “in” to mean something like “immediately inside of.” My pen is in box B, but is 
not immediately inside of it because there is another box between the pen and box B. 

Professor: That sounds right. 

Student: OK, I’ve sealed up all my boxes. I’ve made sure that different ones have 
different contents. I’m taking “in” to mean “immediately inside of.” But now I notice 
something: my pen cannot be immediately inside of more than one box. If sets are 
like boxes, then I guess nothing belongs to more than one set. 

Professor: No, no! We can prove that every set belongs to infinitely many sets. 

And so on. 

I am going to try my best not to hoodwink you. I am not going to say that 
mathematical sets are closely similar to any items of everyday experience. I know 
of no such items. I will insist that sets are, in several important respects, like certain 
things we can readily understand on the basis of everyday experience. The point 
is that experience with everyday objects can help us understand things that are not 
objects of everyday experience. We are going to discuss some things, known as 
“unranked list-types,” that are not physical objects.2 You will not bump against them
as you move through space. You cannot find one in your pocket or under a rock or 
even in a distant galaxy. Nonetheless, your experience of some of the things you 
do bump against (such as paper lists) will help you understand unranked list-types. 
Even if you end up a skeptic about non-physical beings, certain experiences will help 
you understand what unranked list-types would be like if they were to exist. This, in 
turn, will help you understand mathematical sets. So, yes, everyday experience can 
help you understand mathematical sets. I do not deny that, but I do insist that the 
path from physical realities to mathematical idealities is longer than some professors 
admit. 

To repeat: mathematical sets are, in several important respects, like certain 
things, unranked list-types, that we can readily understand on the basis of everyday 


2 You already encountered the type/token distinction in Chap. 1. The relation between a single
list-type and its many paper or pixel tokens is like the relation between a single numeral-type and 
the many numeral-tokens of that type. 

experience. There are also important dissimilarities and I will be careful to point out
various ways that unranked list-types are unlike sets. Our starting point is the commonplace
remark that you can identify a finite set by listing its members. If the list
displays all the essential properties of the set, why not ignore the set and concentrate
on the list?3


3.2 List-Types 


Suppose I decide to list the three Beethoven piano sonatas I most admire. You decide 
to make a list of your own. I show you this: 


opus 14, #2 
opus 53 
opus 109 


You show me this: 


opus 109 
opus 53 
opus 14, #2 


As long as it is understood that our lists are unranked, it would be correct for one of 
us to say, “Hey, we have the same list.” Now the physical objects we have displayed, 
the paper lists, are clearly distinct. Mine is here; yours is over there. Yet we say we 
have the same list. So the one-and-the-same list we share is not a paper list. The 
philosophers would say it is a LIST-TYPE. This terminology should not be too hard
to swallow since we ourselves might say our two paper lists are “of the same type.” 
That type or species itself is the list we share. From now on let us use the word ‘list’ 
to refer to unranked list-types. Remember: the one list we share, the list-type, is not 


3 This is a strategy mathematicians sometimes employ. For example, see Edwards [4], p. 139, for a 
treatment of multi-sets as lists. A multi-set is a set whose members can occur more than once. For 
a theory of ordered lists whose entries can occur more than once, see Deiser [3]. Such lists are not 
unusual. For instance, here is how the list of Wimbledon Gentlemen’s Singles champions begins. 


1. Spencer Gore 
2. Frank Hadow 
3. John Hartley 
4. John Hartley 


This is not a ranking. (A gentleman cannot outrank himself.) In mathematical parlance, it is a TUPLE 
(in particular, a 4-tuple). The lists we will be discussing in this chapter are not just unranked; they 
are not even ordered: they are not tuples. Among ranked lists there are some with ties and some 
without. Our lists are not of either sort. 

to be identified with the marked-up piece of paper in your hand. Go ahead; tear up
your paper if you want. You will not be tearing up our one shared list. 

This intangible, invisible list of ours may seem a bit mysterious. If we think about 
it too hard, we are likely to churn up all sorts of puzzles and get all sorts of mental 
cramps. None of this need get in the way of our current project. We do not even need 
to believe that such lists exist. I am not trying to establish the existence of anything. 
I am just trying to convey what it would be like if certain structures were to exist. 
All this requires is a basic understanding of some fairly ordinary talk about lists 
and a notion of what it would be like for that talk to be true when given a rather 
literal-minded interpretation. If the outcome is a grasp of the right structures, it will 
not matter if our understanding of lists is imperfect and gives us mental cramps when 
we probe it in certain ways. 

We are taking seriously the idea that your list is the same as mine. How did we 
figure out that our lists are the same? Well, we noticed that we listed the same things. 
There is a general principle at work here: lists (that is, unranked list-types) are the 
same if they list the same items. 
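
This identity principle can be mimicked in Python, with the caveat that what follows is only an analogy. I am assuming that a `frozenset` may stand in for an unranked list-type and that the strings stand in for the sonatas themselves rather than for phrases that name them. Tuples, by contrast, behave like ordered lists.

```python
# Two unranked lists, written down in different orders.
mine = frozenset({"opus 14, #2", "opus 53", "opus 109"})
yours = frozenset({"opus 109", "opus 53", "opus 14, #2"})

# Same entries, hence the same list: the order of writing plays no role.
same_unranked_list = mine == yours

# Ordered lists (tuples) with the same entries in different orders differ.
ranked_mine = ("opus 14, #2", "opus 53", "opus 109")
ranked_yours = ("opus 109", "opus 53", "opus 14, #2")
same_ranked_list = ranked_mine == ranked_yours
```

Here `same_unranked_list` comes out true while `same_ranked_list` comes out false, which is the contrast between our lists and tuples.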

What are these items? We talk about things appearing on lists. This should not 
be confused with certain marks appearing (that is, being physically present) on a 
piece of paper. One could note, quite correctly, that the Waldstein sonata appears 
on my list. That is because the Waldstein sonata is Beethoven’s opus 53 and opus 
53 appears on my list. The words ‘Waldstein sonata’ do not appear on my list; no
marks forming these words appear on my list; but the Waldstein sonata does appear. 
I have not listed phrases that refer to Beethoven sonatas (though I did write down 
such phrases). I have listed Beethoven sonatas. The “listing” relation holds between 
a list and the things listed. One could list phrases that refer to Beethoven sonatas. 


‘Waldstein sonata’
‘Beethoven’s opus 53’
‘Beethoven’s 21st piano sonata’


The point is simply that this is different from listing piano sonatas. Note that the list 


opus 14, #2
Waldstein sonata
opus 109


and the list


opus 14, #2
opus 53
opus 109


are the same because they list the same things. Here I am not announcing some 
profound discovery about the inner workings of the universe. I am just letting you 
know how I understand the expression ‘unranked list-type’.

It is useful to have several ways of referring to the relationship between a list and 
an item listed. We say that y APPEARS ON x or, more briefly, x LISTS y. If x lists y, 
we also say that y is one of x’s ENTRIES. Though it is not standard English, I say that 
a list’s entries FORM that list. My three favorite Beethoven piano sonatas form a list: 
the very list we have been discussing. Some useful notation: we let $\{a, b, c, \ldots\}$ be
the list whose entries are a, b, c, .... So a, b, c, ... form $\{a, b, c, \ldots\}$.

Lists can list lists. For example, I could list the first three lists discussed in 
this chapter. 


My list of three piano sonatas 
My list of three phrases 
My list of three lists 


This list has the odd property of listing itself. My first two lists do not list themselves. 
Could we list all the lists that do not list themselves? The mathematician ERNST 
ZERMELO (1871-1953) figured out that this is logically impossible.4


Exercise 3.1 Show that it would be absurd for there to be a list that lists exactly 
those lists that do not list themselves. 


Zermelo’s discovery should be a warning to us: assertions about the existence of 
lists can far too easily land us in a contradiction. We should handle such assertions 
with care. 

Though it seems a bit of a stretch, we can imagine a list that lists nothing. I might 
undertake to list all the mass murderers I admire. When someone asks me how my 
list is coming along, I display a blank piece of paper. We could say there is no list of 
mass murderers I admire, since I admire no mass murderers. But it seems harmless 
to say there is such a list: a list with the odd property that it has no entries. (“There 
is nothing on your list” is a perfectly acceptable sentence of English.) Since lists
with no entries all have the same entries, they are all the same list: the one-and-only 
BLANK LIST. By displaying my blank piece of paper, I am identifying my list as the 
blank list. More notation: we let ∅ be the blank list.


4 Kanamori [7], Landini [9], Rang and Thomas [12] give some of the history.


3.3 Precursors


People can do things together that none of them can do alone. Fifteen FBI agents 
can surround a house. It is not even clear what it would mean for one of them to do 
so.⁵ Lists behave the same way: they can have a property collectively that each of
them lacks individually. For example, the first three lists I discussed in this chapter 
cooperate to form the third list I discussed. (That is, the first three lists are exactly the 
entries of the third list.) The first list does not, by itself, form the third list. Neither 
does the second nor even the third. (The third list is one of its own entries, but it is 
not all three of them. So it does not form itself.) The three lists have to work together 
to do this “forming.” In this respect, they are like three musicians who can play a 
trio together, though none of them can do it solo. 

Here is another example of a collective property. Consider the following lists: ∅,
{∅}, {{∅}} (the blank list, the list whose only entry is the blank list, and the list whose
only entry is the list whose only entry is the blank list). These lists are, collectively,
ENTRY-CLOSED. That is, each of their entries is itself one of these very lists. The
third list has one entry: {∅}. This is the second of the three lists. You can encounter
it without going beyond the three. The second list has one entry: ∅. This is the first
of the three lists. Again, you do not have to venture beyond the three to encounter
this entry. The first list has no entries. (So it is easy to confirm that each of its entries
is one of the three lists.) We can always fabricate some entry-closed lists by starting
with a list, identifying each of its entries, identifying each entry of those entries, and
so on until we run out of entries. For example, start with {{∅}}. This list has one
entry: {∅}. This list, in turn, has one entry: ∅. At this point, we have run out of entries.
So our procedure yields three lists ∅, {∅}, {{∅}} that, as we saw, are entry-closed.
Note that the two lists ∅, {{∅}} are not, taken together, entry-closed, since {∅} is not
one of them. The list ∅, just by itself, is entry-closed.
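
The procedure just described can be tried out mechanically. What follows is only a sketch, under the assumption that a pure finite list can be modelled by a Python frozenset; the names EMPTY, lst, is_entry_closed, and close_under_entries are mine and carry no official standing.

```python
# Assumption: a list is modelled as a frozenset, the blank list as frozenset().
EMPTY = frozenset()

def lst(*entries):
    """Build the list whose entries are the given lists."""
    return frozenset(entries)

def is_entry_closed(lists):
    """True if every entry of every given list is itself one of the given lists."""
    family = set(lists)
    return all(entry in family for x in family for entry in x)

def close_under_entries(start):
    """Start with one list and keep collecting entries until none are new."""
    found = {start}
    todo = [start]
    while todo:
        for entry in todo.pop():
            if entry not in found:
                found.add(entry)
                todo.append(entry)
    return found

one = lst(EMPTY)   # {∅}
two = lst(one)     # {{∅}}

print(is_entry_closed([EMPTY, one, two]))              # True
print(is_entry_closed([EMPTY, two]))                   # False: {∅} is missing
print(is_entry_closed([EMPTY]))                        # True
print(close_under_entries(two) == {EMPTY, one, two})   # True
```

The search stops only because these lists have finitely many entries at every level; nothing in the sketch addresses lists for which that fails.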


Exercise 3.2 The one list {∅, {{{∅}}, {∅}}} is not, by itself, entry-closed. Starting
with this list, identify lists until your lists, taken together, are entry-closed. 


Now consider any lists that are, collectively, entry-closed. Suppose y is one of 
them. Then each of y’s entries is one of them. And each entry of an entry of y is 
one of them. And each entry of an entry of an entry of y is one of them. And so 
on. (More in a second about the phrase ‘and so on’.) Let us refer to these entries 
of entries of entries ... of entries of y as PRECURSORS of y. Precursors have the 
following properties. 


Proposition 3.1 If some lists are entry-closed and y is one of them, then each precursor
of y is one of them.


Proposition 3.2 The precursors of any list are, collectively, entry-closed. 


5 In the folk song “Bad Man’s Blunder,” the bad man “got surrounded by a sheriff down in Mexico.” 
I understand this to be a slightly odd way of saying that the sheriff arranged for the bad man to be 
surrounded, not that the sheriff did the surrounding all by himself.

If Proposition 3.2 is not evident, just note that any entry of one of y’s precursors will
itself be a precursor of y. 

A list’s precursors are its entries, the entries of its entries, the entries of the entries 
of its entries, and so on. This may seem to make sense. But is it really so clear what 
‘and so on’ means here? I actually think it is clear. Others may disagree. Clear or 
not, it does not really matter: we can avoid ‘and so on’ altogether by using a trick 
invented by Gottlob Frege.⁶ The idea is to turn Proposition 3.1 into our definition of
‘precursor’, in a way to be specified in a minute. A slight glitch: this definition will 
make each list a precursor of itself. Anyone who finds this offensive is encouraged 
to replace ‘precursor’ with a better term. (And, please, let me know what that term 
is.) A reason not to take offense: if we think of an entry of an entry of y as a second- 
order precursor of y and think of an entry of y as a first-order precursor of y, then 
a mathematician might find it natural to think of y itself as a zero-order precursor 
of y. 

Now for Frege’s definition. Suppose, no matter what entry-closed lists you might 
consider, x will appear among them if y does. We stipulate that such x’s are the 
“Frege-precursors” of y. More formally: 


Definition 3.1 x is a FREGE-PRECURSOR of y if and only if, given any entry-closed
lists, if y is one of those lists, then so is x. 


The Frege-precursors of y are what tireless immortal beings encounter when they 
start with y and persist in tracking down the entries of everything they encounter. We 
now hasten to show that we could live without this new term ‘Frege-precursor’: a 
list’s Frege-precursors are exactly its precursors. We take this detour through Frege- 
precursors to confirm that we could introduce the term ‘precursor’ without using the 
phrase ‘and so on’. 

Proposition 3.1 implies that every precursor is a Frege-precursor. What about the 
converse? Could something be a Frege-precursor without being a precursor? Well, 
y is one of its own Frege-precursors. (Given any entry-closed lists, if y is one of 
them, then y is one of them.) But y will not necessarily be an entry of itself or an 
entry of an entry of itself or be embedded inside itself at all. So, as anticipated, the 
notion of Frege-precursor is a bit wider than our original notion of precursor. It does 
no harm, though, to expand the latter notion a tiny bit. A more important question: 
could something other than y be a Frege-precursor without being a precursor in our 
intended sense? That is not going to happen. For suppose x is a Frege-precursor of y. 
That is: 


Given any entry-closed lists, if y is one of them, then so is x.


Proposition 3.2 assures us that y’s precursors are, collectively, entry-closed. So the 
phrase ‘given any’ allows us to pick these: the precursors of y. That is:


If y is one of y’s precursors, then so is x. 


6 Readers who like a challenge might tackle Frege’s own exposition in Frege [5]; English translation
in van Heijenoort [13], pp. 5-82.

We have decided to let y be one of its own precursors. So x is a precursor of y and,
more generally, something is a Frege-precursor of y if and only if it is a precursor of y.
Apart from the slight glitch of making y one of its own precursors, Frege’s definition 
captures the notion we hoped it would capture. If anyone wants to know what we 
meant earlier when we said “and so on,” we can send them to Frege’s definition. 
From now on, we will use Definition 3.1 as our definition of ‘precursor’ and will 
drop the ‘Frege’ part. 
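
For readers who like to see such an argument checked on a small case, here is a sketch. It rests on two assumptions of mine: lists are modelled as frozensets, and "any entry-closed lists" is restricted to families drawn from a small hypothetical universe that is itself entry-closed. Within that universe, intersecting all the entry-closed families that contain y gives the same result as tracking down entries one step at a time.

```python
from itertools import combinations

EMPTY = frozenset()
ONE = frozenset({EMPTY})            # {∅}
TWO = frozenset({ONE})              # {{∅}}
PAIR = frozenset({EMPTY, ONE})      # {∅, {∅}}
UNIVERSE = [EMPTY, ONE, TWO, PAIR]  # hypothetical universe, itself entry-closed

def is_entry_closed(family):
    """True if each entry of each member of the family is in the family."""
    return all(entry in family for x in family for entry in x)

def frege_precursors(y, universe):
    """Intersect every entry-closed family (drawn from the universe) containing y."""
    result = set(universe)
    for size in range(len(universe) + 1):
        for family in combinations(universe, size):
            family = set(family)
            if y in family and is_entry_closed(family):
                result &= family
    return result

def precursors_by_search(y):
    """y, its entries, the entries of its entries, and so on."""
    found, todo = {y}, [y]
    while todo:
        for entry in todo.pop():
            if entry not in found:
                found.add(entry)
                todo.append(entry)
    return found

for y in UNIVERSE:
    assert frege_precursors(y, UNIVERSE) == precursors_by_search(y)
print(frege_precursors(TWO, UNIVERSE) == {EMPTY, ONE, TWO})   # True
```

This is a spot check, not a proof: Definition 3.1 quantifies over all entry-closed lists, not just those drawn from four hand-picked ones.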


Exercise 3.3. Say that your ANCESTORS are your parents, your parents’ parents, 
your parents’ parents’ parents, and so on. Use Frege’s technique to explain what an 
ancestor is without using the phrase ‘and so on’. 


Exercise 3.4 Use Definition 3.1 to show that ∅ is the only precursor of ∅.


Exercise 3.5 Use Definition 3.1 to show that the precursors of y’s precursors are 
themselves precursors of y. 


Exercise 3.6 Let “the Y-lists” be y itself and the precursors of y’s entries. Show, 
first, that the Y-lists are entry-closed. Use this result and Definition 3.1 to show that 
each precursor of y is either y itself or a precursor of an entry of y. 


Exercise 3.7 Use Definition 3.1 to show that if y lists the entries of its entries, then 
the precursors of y are y itself and y’s entries. 


3.4 Logically Possible Lists 


We have already seen that it is easy to produce a description of a list that turns out 
to be incoherent (as with the list of all lists that do not list themselves). Speaking a 
bit carelessly: there are lists that could not be. Our notion of precursor will help us 
explore what lists there could be. I suggest we start out with a particular list and see 
what lists it allows us to list. Take the list {{∅}}. Since it has only three precursors we
can easily list them: ∅, {∅}, {{∅}}. If a list had 10¹⁰⁰ precursors, we human beings
could not list them (at least, not in any sense of “list” I recognize). But this is because 
of our physical limitations. It does not seem logically impossible for the precursors 
of each list to be listed. Now when mathematicians have convinced themselves that 
something could be the case, they sometimes go ahead and assert that it is the case. 
This might be an expression of certainty. (“Not only could this be true; it definitely 
is true.”) On the other hand, it might just be the first step in seeing what interesting 
things follow from the proposition asserted. In the latter spirit, we shall entertain a 
proposition about what lists there are. 




Proposition 3.3. The precursors of any list form a list. 


We hereby announce our intention to explore what the world would be like if such a 
thing were true. 

Before we get carried away, though, notice that our principle seems too modest. 
If there is nothing fundamentally incoherent about listing all the precursors of a list, 
there should be nothing fundamentally incoherent about listing some of them. There 
are 2³ lists that list only precursors of {{∅}}. One of them lists none. Three list one;
another three list two. One, as we have already seen, lists all three. This suggests a 
slightly stronger principle. 


Proposition 3.4 Any precursors of a list form a list.


That is, if x is a list and a, b, c, ... are all precursors of x, then {a, b, c, ...} is a list.


Exercise 3.8 List all the lists that list only precursors of {{∅}}.


You just listed all the lists of precursors of {{∅}}. If you could list all of them,
surely you could have listed some of them. Without bothering to identify all 2⁸ lists
of lists of precursors of {{∅}}, let us go ahead and assert the corresponding principle.


Proposition 3.5 Any lists of precursors of a list form a list. 


That is, if x is a list and a, b, c, ... list only precursors of x, then {a, b, c, ...} is a
list. The idea behind this principle comes from the philosopher DAVID BENNETT.⁷
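
The counts mentioned above (eight lists of precursors of {{∅}}, and 256 lists of those lists) can be confirmed without writing the lists out. The sketch below assumes that lists are modelled as frozensets; the helper all_sublists is a name of my own.

```python
from itertools import combinations

def all_sublists(items):
    """Every list (frozenset) whose entries are drawn from the given items."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

EMPTY = frozenset()
ONE = frozenset({EMPTY})          # {∅}
TWO = frozenset({ONE})            # {{∅}}
precursors = [EMPTY, ONE, TWO]    # the three precursors of {{∅}}

lists_of_precursors = all_sublists(precursors)
print(len(lists_of_precursors))                    # 8

# How many of the eight list none, one, two, or three precursors.
by_size = [sum(1 for x in lists_of_precursors if len(x) == k) for k in range(4)]
print(by_size)                                     # [1, 3, 3, 1]

print(len(all_sublists(lists_of_precursors)))      # 256
```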

Should we now assert that any lists of lists of precursors of a list form a list and 
that any lists of lists of lists of precursors of a list form a list and so on forever? Not 
to worry: that will not be necessary. Our principles already imply all the propositions 
of this type we are going to need. Let us verify this. Here are the axioms we are going 
to use. 


Axiom 3.1 Lists are the same if they have the same entries. 
Axiom 3.2 There is a list with no entries. 
Axiom 3.3 Any lists of precursors of a list form a list. 


For the rest of this chapter, we will identify interesting things that follow from 
these axioms. Some new vocabulary will be useful. 


Definition 3.2 A list is PURE if and only if all its precursors are lists. 


The list {∅, {Frege}} is not pure because Frege is one of its precursors and (presumably)
Frege is not a list. Your eight lists of precursors of {{∅}} are all pure. Precursors
of a pure list are themselves lists of precursors of that pure list. (Each of them is a 
list and each of their entries is a precursor of the pure list.) So Axiom 3.3 yields the 
following theorem. 


7 See Bennett [2] (available via http://projecteuclid.org).

Theorem 3.1 Any precursors of a pure list form a list.


More narrowly: 


Theorem 3.2 Any entries of a pure list form a list. 


Definition 3.3. A PART of a list y is any list of all, some, or none of y’s entries. (That 
is, x is a part of list y if and only if x is a list all of whose entries are entries of y.) 
A PROPER PART of y is any part of y other than y itself. 


Note that each list is a part of itself (though not a proper part) and that the blank 
list is a part of every list. (∅ is a part of list y because ∅ has no entries that are not
entries of y.) Since each entry is a precursor, each part is a list of precursors. So 
Axiom 3.3 implies: 


Theorem 3.3 Any parts of a list form a list.


In particular, all of a list’s parts form a list.


Definition 3.4 Py is the list of y’s parts.


Exercise 3.9 Show that Py is pure if y is. (Remember Exercises 3.5 and 3.6.)


It is a special case of Theorem 3.1 that all of a pure list’s precursors, acting in 
concert, form a list. 


Definition 3.5 If y is pure, then Sy is the list of y’s precursors.


Exercise 3.10 Show that Sy is pure if y is. (Remember Exercises 3.5 and 3.6.)


Suppose y is pure. PSy is the list of all lists of y’s precursors and PPSy is the list 
of all lists of lists of y’s precursors. Any entries of PPSy form a list. So any lists of 
lists of precursors of a pure list form a list. We could repeat this argument as many 
times as we want. So, as long as we concentrate on pure lists, we’ll have all the lists 
of lists of lists ... of lists of precursors we want. 
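
The operations P and S can be sketched for small pure lists as follows. I am assuming, purely for illustration, that lists are modelled as frozensets; the Python functions P and S below are my own stand-ins for Definitions 3.4 and 3.5.

```python
from itertools import combinations

EMPTY = frozenset()

def P(y):
    """The list of y's parts: every list all of whose entries are entries of y."""
    entries = list(y)
    return frozenset(frozenset(c) for r in range(len(entries) + 1)
                     for c in combinations(entries, r))

def S(y):
    """The list of y's precursors: y itself plus everything reached via entries."""
    found, todo = {y}, [y]
    while todo:
        for entry in todo.pop():
            if entry not in found:
                found.add(entry)
                todo.append(entry)
    return frozenset(found)

two = frozenset({frozenset({EMPTY})})    # {{∅}}
print(len(S(two)))                       # 3
print(len(P(S(two))))                    # 8
print(len(P(P(S(two)))))                 # 256
print(EMPTY in P(two), two in P(two))    # True True
```

The last line illustrates the earlier remark that the blank list is a part of every list and that each list is a part of itself.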


Exercise 3.11 Show that Sy ≠ ∅.


Exercise 3.12 Show that Px is a proper part of Py only if x is a proper part of y. 


3.5 Numbers Can Be Lists 


It seems likely that Axioms 3.1—3.3 are not logically absurd. It seems likely that they 
are consistent. The logician WILHELM ACKERMANN (1896-1962) figured out how 
to prove they are consistent.⁸ Natural numbers can have a lot of internal structure
that can be revealed in various ways: by taking the prime factorization, for example. 


8 See Ackermann [1]. For a more recent discussion, see Kaye and Wong [8] (available via
http://projecteuclid.org).




This internal structure can be used to store information. Ackermann appreciated 
something that now may seem obvious: we can store a lot of information in natural 
numbers by exploiting their binary representation. 

Suppose you are going to meet with Julia, Martin, and Yuri and need to know 
which of them are spies. I send you a very short message: “6.” We have already 
agreed that you are to convert my message into binary, writing the digits under the 
three names taken in alphabetical order. Here is what you get.⁹


Julia Martin Yuri 
1 1 0 


We agreed that 1 means yes and 0 means no. So my message says that Julia and 
Martin are spies, but Yuri is not. That is, 6 stands for the list {Julia, Martin}. 
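
The agreement about the spy message is simple enough to write down as a procedure. This is only a sketch; the function name decode_message is mine, and it assumes the number has no more binary digits than there are names.

```python
def decode_message(number, names):
    """Write the binary digits of number under the names taken in alphabetical order."""
    ordered = sorted(names)
    # Pad with leading zeros so that there is one digit per name.
    digits = format(number, "b").zfill(len(ordered))
    # 1 means yes, 0 means no.
    return {name for name, digit in zip(ordered, digits) if digit == "1"}

print(decode_message(6, ["Yuri", "Julia", "Martin"]) == {"Julia", "Martin"})   # True
```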

Of course, numbers do not have to be lists of spies. They could be lists of lists. 
What list of lists is 6? To find out, first represent 6 in binary: 110. Since we have only 
three digits, 6 is going to list at most three lists. Our candidates are going to be the 
lists represented by the numbers 0, 1, and 2. (Yes, I realize we do not yet know what 
these lists are! Be patient. We will figure that out in a minute.) We do our decoding 
much as we did with the spies, though we now write down our candidate entries in 
descending numerical order. 


List 2   List 1   List 0
  1        1        0


Again, 1 means yes and 0 means no. So 6 says yes to 1 and 2, but no to 0.


List 2   List 1   List 0
  1        1        0
 Yes!     Yes!     No!


That is, 6 = {1, 2}. What are 1 and 2? A number with n binary digits passes judgment
on lists 0 through n − 1. So 1, with its one binary digit, passes judgment only on the
first of our lists.


List 0
  1
 Yes!


1 says yes to 0. So 1 = {0}. What is 0? That is easy: 0 is just one big no. It says no
to the one list on which it passes judgment.


List 0 
0 
No! 


9 Recall that 6 is 110 in binary notation because 6 = 1·2² + 1·2¹ + 0·2⁰.

Since 0 says no to 0, it lists nothing at all: 0 = ∅. Since 1 says yes to 0 and to nothing
else, 0 is the one entry of 1. That is, 1 = {∅}. What is 2? 2 is 10 in binary: yes to list
1 (that we have identified as {∅}), no to list 0 (that we have identified as ∅).


{∅}     ∅
 1      0
Yes!   No!


So {∅} is the one entry of 2. That is, 2 = {{∅}}. Now, finally, we know what 6 is.
Since 6 is 110 in binary, it says yes to list 2 (that we have identified as {{∅}}), yes to
list 1 (that we have identified as {∅}), and no to list 0 (that we have identified as ∅).


{{∅}}   {∅}     ∅
  1      1      0
 Yes!   Yes!   No!


So the two entries of 6 are {{∅}} and {∅}. That is, 6 = {{{∅}}, {∅}}.
Here is one more example. We shall do all our figuring in binary. What is 1010? 
Our decoding key is the following. 


List 11   List 10   List 1   List 0
   1         0        1        0


We already know that 10 = {{∅}}, 1 = {∅}, and 0 = ∅. What is 11? 11 says yes to
lists 1 and 0.


{∅}     ∅
 1      1
Yes!   Yes!


So 11 = {{∅}, ∅}. This yields the following decoding scheme.


{{∅}, ∅}   {{∅}}   {∅}     ∅
   1         0      1      0
  Yes!      No!    Yes!   No!


So 1010 = {{{∅}, ∅}, {∅}}.


Exercise 3.13 Decode the number 18.


Ackermann’s coding trick allows us to interpret claims about lists as claims about 
natural numbers. Axiom 3.1 says that numbers with the same binary representation 
are the same. Axiom 3.2 says there is a number with no 1’s in its binary representation.
Axiom 3.3 says something a bit obscure; but it too turns out to be true. 
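
The decoding steps worked through above can be sketched as a short recursive procedure. The function names entries and decode are mine; the only thing taken from the text is the rule that a 1 in position i of the binary representation means yes to list i.

```python
def entries(k):
    """Positions of the 1 digits in k's binary representation: the lists k says yes to."""
    return [i for i in range(k.bit_length()) if (k >> i) & 1]

def decode(k):
    """Render the list coded by k, larger candidates first as in the tables above."""
    if k == 0:
        return "∅"
    return "{" + ", ".join(decode(i) for i in reversed(entries(k))) + "}"

print(decode(6))        # {{{∅}}, {∅}}
print(decode(0b1010))   # {{{∅}, ∅}, {∅}}
```

The recursion ends because every entry of k is smaller than k, so each call works on a smaller number until it reaches 0.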

In the Ackermann interpretation, “lists” are just natural numbers. If such a list k is
of the form 2^a + 2^b + 2^c + ··· (where a, b, c, ... are distinct natural numbers), then
its “entries” are the exponents a, b, c, ..., with each of these representing a list. If


a = 2^(a₁) + 2^(a₂) + 2^(a₃) + ···


then a₁, a₂, a₃, ... are all entries of a and, hence, are all entries of entries of k. A
similar analysis of, say, a₁ could reveal entries of entries of entries of k. The rather
bland idea behind the Ackermann reading of Axiom 3.3 is that this process cannot
go on forever. For example, consider the “list” 9. We dissect this as follows.


9 = 2³ + 2⁰
    ↓
3 = 2¹ + 2⁰
    ↓
1 = 2⁰


The idea is to provide a binary representation of each non-zero exponent until we
run out of such exponents. Since k < 2^k, the exponents get smaller at each step
and, so, we have to run out of non-zero ones eventually. The “precursors” of 9 are
9 itself and the exponents that appear in this analysis. They are: 9, 3, 1, 0. If we
want to list some of these precursors, there is nothing to stop us. For example, since
515 = 2⁹ + 2¹ + 2⁰, it “lists” 9, 1, and 0. Similarly, there is nothing to stop us from
listing any of these lists of precursors. So any lists of precursors of a list form a list.
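
The dissection of 9 can be carried out mechanically. The sketch below uses names of my own (entries, precursors, code); it simply reads off exponents from binary representations until no new ones turn up.

```python
def entries(k):
    """The exponents appearing when k is written as a sum of distinct powers of 2."""
    return [i for i in range(k.bit_length()) if (k >> i) & 1]

def precursors(k):
    """k itself together with every exponent met while dissecting k."""
    found, todo = {k}, [k]
    while todo:
        for e in entries(todo.pop()):
            if e not in found:
                found.add(e)
                todo.append(e)
    return found

def code(numbers):
    """The number that 'lists' exactly the given numbers."""
    return sum(2 ** n for n in set(numbers))

print(sorted(precursors(9), reverse=True))   # [9, 3, 1, 0]
print(code([9, 1, 0]))                       # 515
print(entries(515))                          # [0, 1, 9]
```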


Exercise 3.14 Identify the precursors of 12. Identify a number that lists some lists 
of 12’s precursors. (There are 4,294,967,296 such numbers.)


Axioms 3.1—3.3 could all be true because they are all true under the Ackermann 
interpretation. Logical absurdities are propositions that cannot be true. So the system 
consisting of Axioms 3.1—3.3 is not logically absurd. Or, to state the matter more 
carefully, our system is absurd only if arithmetic is absurd. The consistency of our 
system is at least as certain as the consistency of arithmetic. A reasonable person 
could consider this a proof of the consistency of our system. 


3.6 Lists Can Be Numbers 


We now have some idea of how our theory of lists could be developed inside number 
theory. We can also develop number theory inside our theory of lists. Recall that Sx 
is the list of x’s precursors. We now introduce some new terminology. 


Definition 3.6 Some lists are, collectively, S-CLOSED if and only if they satisfy the 
following condition: if a list x is one of them, then x’s precursors form a list and that 
list too is one of them. 


Definition 3.7 A list x is a NUMBER if and only if it satisfies the following condition:
given any S-closed lists, if ∅ is one of them, then so is x.

Numbers are what tireless immortal beings encounter when they start with ∅ and
persist in applying the operation S to everything they encounter.


Exercise 3.15 Prove that ∅ is a number.


Our definition of number immediately yields an induction principle.


Theorem 3.4 If some lists are S-closed and ∅ is one of them, then every number is
one of them. 


In an ordinary inductive proof, we first show that 0 has a certain property. We then
suppose that an arbitrary number n has the property and use this information to show
that n + 1 also has the property. This establishes that every natural number has the
property. Theorem 3.4 justifies a similar technique. First, show that ∅ has a certain
property. Second, assume that an arbitrary number n has the property and use this
information to show that Sn also has the property. Third, conclude that every number
has the property. We use this form of induction to prove the next two theorems.


Theorem 3.5 If n is a number, then Sn exists and is itself a number.


Proof If x is pure, then, by Theorem 3.1, Sx exists and, by Exercise 3.10, it too is
pure. So the pure lists are, collectively, S-closed. Furthermore, by Exercise 3.4, ∅ is
pure. So, by Theorem 3.4, every number is pure (since ∅ is pure and purity is always
transmitted from x to Sx). Suppose n is a number. Then, by Theorem 3.1, Sn exists.
Since n is a number, you will find it among some S-closed lists whenever you find
∅ among them. So you will find Sn among some S-closed lists whenever you find ∅
among them. That is, Sn is a number. □


Theorem 3.6 Each number lists the entries of its entries. 


Proof ∅ lists all the entries of its entries for the same reason it lists all the even primes
greater than 2: there are no such things. Suppose n lists the entries of its entries. Then,
according to Exercise 3.7, the precursors of n are n itself and n’s entries. Suppose
Sn lists k and k lists j. Then k is either n itself or an entry of n. In either case, j is a
precursor of n and, so, is listed by Sn. So Sn lists the entries of its entries. We have
shown that ∅ satisfies the theorem and that Sn does whenever n does. Now we just
apply Theorem 3.4. □


Exercise 3.7 now yields the following. 


Theorem 3.7 If n is a number, then n and its entries form Sn (that is, the precursors
of n are n itself and the entries of n).¹⁰


Exercise 3.16 Show that no number lists itself. (You might try induction.) 


Exercise 3.17 Show that if m and n are numbers and Sm = Sn, then m = n.


10 In set theoretic notation: Sn = n ∪ {n}.

Exercise 3.18 Show that if m and n are numbers and n lists m, then Sn lists Sm.
(You might try induction.)


Exercise 3.19 Show that every number other than ∅ lists ∅. (You might try induction.)


∅ is playing the role of 0 here. S∅ is 1. SS∅ is 2. And so on. Each number is the
list of prior numbers. 


0 = ∅
1 = {0}
2 = {0, 1}
3 = {0, 1, 2}


That is, Sn = {0, ..., n}. The mathematician DIMITRY MIRIMANOFF (1861-1945)
invented this beautiful technique for constructing objects that behave like the natural
numbers.¹¹ On this basis, we can reproduce arithmetic using nothing but lists.
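
Here is a sketch of the first few of these numbers, again under my assumption that lists can be modelled as frozensets. The function S below does not compute precursors in general; it relies on Theorem 3.7, which says that for a number n, Sn is formed by n together with n's entries.

```python
EMPTY = frozenset()

def S(n):
    """For a number n: the list formed by n and n's entries (Theorem 3.7)."""
    return n | frozenset({n})

numbers = [EMPTY]
for _ in range(4):
    numbers.append(S(numbers[-1]))

for value, n in enumerate(numbers):
    # Each number is the list of prior numbers.
    assert n == frozenset(numbers[:value])

print([len(n) for n in numbers])   # [0, 1, 2, 3, 4]

# Among these numbers, "j lists k" holds exactly when k comes before j.
print(all((numbers[k] in numbers[j]) == (k < j)
          for j in range(5) for k in range(5)))   # True
```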

Since each number lists all smaller numbers and is listed by all larger ones,
“number j lists number k” is equivalent to “k is smaller than j” and “j is larger than
k.” We have already shown that, as long as we are talking about numbers, the “listing”
(or “larger than”) relation is TRANSITIVE (Theorem 3.6) and IRREFLEXIVE (Exercise 
3.16). We now show that it has two other important properties. 


Theorem 3.8 WELL-FOUNDEDNESS: Given any numbers at all, one will list none of 
the others. 


Proof Say that some numbers are, collectively, FRIENDLY if each of them lists at
least one of the others. Suppose you are presented with some friendly numbers. Say
that a number is UNFRIENDLY if it lists none of those friendly numbers. Since ∅ lists
nothing, it is unfriendly. Suppose n is unfriendly. If one of n’s entries listed one of
the friendly numbers, then, by Theorem 3.6, n would do so too. So n’s entries are
all unfriendly. But then, by Theorem 3.7, Sn lists none of the friendly numbers and,
so, is itself unfriendly. We conclude, by induction, that there really are no friendly
numbers. □


Theorem 3.9 CONNECTEDNESS: Given any two numbers, one will list the other. 


Proof Say that two numbers are CONNECTED if and only if one of them lists the other.
In Exercise 3.19, you showed that ∅ is connected with every other number. Suppose
n is so as well. Let k be a number distinct from and unconnected with Sn. k either
lists or is listed by n. It cannot be the latter, since k would then be listed by Sn. So
k lists n and, hence, (as you showed in Exercise 3.18) Sk lists Sn. This means that
Sn is either k (contrary to our assumption that they are distinct) or an entry of k
(contrary to our assumption that they are unconnected). We conclude, by induction,
that every number is connected with every other number. □


11 See Hallett [6], pp. 185-194, and Mirimanoff [10].

Since the relation of listing, when applied to numbers, is transitive, irreflexive,
well-founded, and connected, it is said to WELL-ORDER the numbers.


Exercise 3.20 Show that if m and n are numbers, then there is a list m \ n that lists,
exactly, the numbers that m lists but n does not. What is 13 \ 9? What is 9 \ 13?


3.7 Super-Numbers 


Recall that if x is a list, then Px is the list of x’s parts. 


Definition 3.8 Some lists are, collectively, P-CLOSED if and only if the list Px is 
one of them whenever the list x is. 


Definition 3.9 A list x is a SUPER-NUMBER if and only if it satisfies the following
condition: given any P-closed lists, if ∅ is one of them, then so is x.


Super-numbers are what tireless immortal beings encounter when they start with ∅
and persist in applying the operation P to everything they encounter. The first few
super-numbers are ∅, P∅, PP∅, PPP∅. That is:


∅
{∅}
{∅, {∅}}
{∅, {∅}, {{∅}}, {∅, {∅}}}
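
The first few super-numbers can be generated by applying P repeatedly. As before, this is a sketch that assumes lists are modelled as frozensets; the Python function P is my own rendering of Definition 3.4.

```python
from itertools import combinations

EMPTY = frozenset()

def P(y):
    """The list of y's parts: every list all of whose entries are entries of y."""
    entries = list(y)
    return frozenset(frozenset(c) for r in range(len(entries) + 1)
                     for c in combinations(entries, r))

supers = [EMPTY]
for _ in range(4):
    supers.append(P(supers[-1]))

# Number of entries of ∅, P∅, PP∅, PPP∅, PPPP∅.
print([len(x) for x in supers])   # [0, 1, 2, 4, 16]

# Each of these super-numbers lists all the earlier ones.
print(all(supers[k] in supers[j] for j in range(5) for k in range(j)))   # True
```

The next super-number after these would have 65,536 entries, so it is wise to stop the loop early.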


Theorem 3.10 ∅ is a super-number.


Theorem 3.11 If x is a super-number, then so is Px.


Super-numbers behave very much like numbers (hence the name).¹²


Theorem 3.12 Py ≠ ∅.


Exercise 3.21 Show that Px = Py only if x = y.


Theorem 3.13 If some lists are P-closed and ∅ is one of them, then every super-number
is one of them.


Note that the last theorem is a kind of induction principle. You can show that 
every super-number has a certain property by showing that ∅ has the property and
that each super-number x passes the property on to its “successor” Px. 


12 I should warn you, though, that the term ‘super-number’ is not standard. Do not expect anyone
else to refer to these objects in this way.

Exercise 3.22 Show that every super-number is pure.


Exercise 3.23 Show that every super-number lists the parts of its entries.


Exercise 3.24 Show that if super-number x lists y, then Px lists Py.


Theorem 3.14 If x and y are super-numbers and x is a proper part of y, then x is 
an entry of y. 


Proof The theorem comes out true when x = ∅ because every super-number other
than ∅ lists ∅. (Every super-number lists the parts of its entries and ∅ is a part of every
entry.) Suppose, as an inductive hypothesis, that the super-number x is an entry of a
super-number whenever it is a proper part. Suppose Px is a proper part of the super-number
y. Then y ≠ ∅ and we can show that y = Py′ for some super-number y′. (Go
ahead and prove it if you do not already believe it.) So Px is a proper part of Py′ and,
as you showed in Exercise 3.12, x is a proper part of y′. Our inductive hypothesis
now guarantees that x is an entry of y′. So, by Exercise 3.23, y′ lists all the parts of
x and, hence, Px is a part of y′. That is, Px is an entry of Py′, as desired. We have
shown that ∅ satisfies the theorem and that the super-numbers satisfying the theorem
are, collectively, P-closed. So every super-number satisfies the theorem. You will
probably find this method of induction useful. (Perhaps you already found it useful
when you did Exercises 3.22 and 3.23.) □


Exercise 3.25 Show that every super-number lists the entries of its entries (so every 
entry of a super-number x is a part of x). 


Exercise 3.7 now allows us to prove the following. 


Theorem 3.15 The precursors of a super-number x are x itself and x’s entries. 


Exercise 3.26 Show that, given any two super-numbers, one will list the other. (As 
in the proof of Theorem 3.9, say that two super-numbers are CONNECTED if and only 
if one of them lists the other. As an inductive hypothesis, suppose x is a super-number 
connected with every other super-number. Let y be a super-number distinct from Px. 
Note that either x lists y or y lists x. To confirm that y and Px are connected, consider 
each of these cases.) 


Definition 3.10 A list is WELL-FOUNDED if and only if, given any of its precursors, 
at least one of them lists none of them. 


Earlier in this chapter, we considered a list that may have struck you as odd. Here it 
is again. 


My list of three piano sonatas 
My list of three phrases 
My list of three lists


My list of three lists was this very list. If called upon to pick some precursors of this
list, we could, if we wished, pick the list itself and then decline to pick anything else.
Then all of the precursors we have picked (all one of them) list a precursor we have
picked. So the above list is not well-founded: we can find an endless path through its
precursors that takes us from a precursor (my list of three lists) to an entry of that
precursor (my list of three lists) to an entry of that precursor (my list of three lists)
and so on forever. Quite generally, lists that list themselves


x → x


are not well-founded. Neither are lists listed by something they list


y → z → y.


Nor is a list with infinitely many precursors w₁, w₂, w₃, ... each listing the next


w₁ → w₂ → w₃ → ···


In each case, we have precursors each of which lists at least one of those very
precursors. (x lists x. y lists z, while z lists y. wₖ lists wₖ₊₁.)
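
For lists with finitely many precursors, Definition 3.10 can be tested mechanically. The sketch below is mine, not part of the theory: since a list that lists itself cannot be modelled by a frozenset, I represent the listing relation by a hypothetical table from names to the names of entries, and I simplify by treating the first two lists as if they had no entries that matter. The infinite chain w₁, w₂, w₃, ... is beyond the reach of this check.

```python
def precursors(start, lists):
    """start together with everything reached by repeatedly taking entries."""
    found, todo = {start}, [start]
    while todo:
        for e in lists[todo.pop()]:
            if e not in found:
                found.add(e)
                todo.append(e)
    return found

def is_well_founded(start, lists):
    """Repeatedly discard precursors that list none of the remaining ones.
    If some precursors survive, each of them lists another survivor."""
    remaining = precursors(start, lists)
    while True:
        minimal = {x for x in remaining
                   if not any(e in remaining for e in lists[x])}
        if not minimal:
            return not remaining
        remaining -= minimal

# Hypothetical table: each name is paired with the names of its entries.
lists = {
    "sonatas": [],                              # simplification: entries ignored
    "phrases": [],                              # simplification: entries ignored
    "lists": ["sonatas", "phrases", "lists"],   # the list that lists itself
    "y": ["z"], "z": ["y"],                     # y lists z, while z lists y
    "blank": [], "one": ["blank"], "two": ["one"],
}

print(is_well_founded("lists", lists))   # False
print(is_well_founded("y", lists))       # False
print(is_well_founded("two", lists))     # True
```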


Theorem 3.16 Every super-number is well-founded.


Proof The only precursor of ∅ (itself) lists nothing. So ∅ is well-founded. Suppose
x is a well-founded super-number. Let y₁, y₂, y₃, ... be precursors of Px that violate
well-foundedness. (That is, each yₖ lists some yⱼ.) Theorems 3.11 and 3.15 guarantee
that each yᵢ is either Px itself or an entry of Px. Each case yields a yⱼ that is an entry
of x. (Suppose, first, that yₖ is an entry of Px. Let yⱼ be an entry of yₖ. Then yⱼ is an
entry of x since it is an entry of a part of x. Now suppose yₖ = Px. Let yᵢ be an entry of
yₖ. yᵢ is a part of x. Pick a yⱼ listed by yᵢ. yⱼ is an entry of x.) Let w₁, w₂, w₃, ... be
the yᵢ’s listed by x. Pick a wₖ. Since wₖ appears among y₁, y₂, y₃, ... and these lists
violate well-foundedness, we can pick a yⱼ listed by wₖ. By Exercise 3.25, x lists yⱼ
and, hence, yⱼ appears among w₁, w₂, w₃, .... More generally, each wₖ lists some
yⱼ that appears among w₁, w₂, w₃, .... That is, each of the lists w₁, w₂, w₃, ... lists
at least one of the lists w₁, w₂, w₃, .... Since w₁, w₂, w₃, ... are all precursors of x,
this contradicts the well-foundedness of x. So we must have been wrong to deny that
Px is well-founded. Since ∅ is well-founded and the well-founded super-numbers
are, collectively, P-closed, every super-number is well-founded. □


3.8 Hereditary Finiteness 


A finite list is a list with finitely many entries. Hereditarily finite lists are finite “all 
the way down.” Their entries have finitely many entries, as do the entries of their 
entries, the entries of the entries of their entries, and so on. The pure, well-founded,
hereditarily finite lists are the lists coded by numbers in the Ackermann interpretation.
They are also the lists listed by super-numbers. Now “pure, well-founded, hereditarily
finite” is quite a mouthful. We will follow set theoretic practice and shorten it to just
“hereditarily finite” even though the odd list we discussed in the previous section
is hereditarily finite without being either pure or well-founded. Whenever you see
“hereditarily finite,” try to remember that there are invisible qualifications: “pure and
well-founded.” 


Definition 3.11 A list is hereditarily finite if and only if it is listed by a super-number. 


Exercise 3.27 Show that each number is hereditarily finite. (After confirming that 
a super-number lists 0, you might assume, as an inductive hypothesis, that a super- 
number x lists the number n. Theorem 3.7 might help you show that Sn is a part of 
x and, hence, that Px lists Sn.) 


Definition 3.12 If x is pure, then ⋃x is the list of the entries of x’s entries.


It may not be obvious why Definition 3.12 requires that x be pure (that is, that 
x’s precursors all be lists). Consider the impure list {{Frege}, {Hilbert}}. This list has 
two entries: {Frege} and {Hilbert}. The first of these entries has a single entry: Frege. 
The second also has a single entry: Hilbert. So Frege and Hilbert are the entries of 
the entries of {{Frege}, {Hilbert}}. Now there is nothing to prevent Frege and Hilbert 
from forming a list: {Frege, Hilbert}. This would be the list ⋃{{Frege}, {Hilbert}} or,
in notation that may be more familiar, {Frege} ∪ {Hilbert}. Remember, though, what
our current project is. We are working inside an axiomatic system. We are trying to
see what follows from Axioms 3.1–3.3. No matter how obvious it may seem that a
certain list exists, we cannot accept that it exists unless our axioms confirm this. The 
relevant axiom here is 3.3: 


Any lists of precursors of a list form a list. 


Frege and Hilbert are, indeed, precursors of {{Frege}, {Hilbert}}, but they are not 
lists of precursors because (presumably) they are not lists. So Axiom 3.3 does not 
apply. This is why Theorem 3.1 


Any precursors of a pure list form a list 


includes the qualification “pure.” Consider, for example, the pure list {{∅}, {{∅}}}. It
has two entries: {∅} and {{∅}}. The first of these entries has a single entry: ∅. The
second also has a single entry: {∅}. So ∅ and {∅} are the entries of the entries of
{{∅}, {{∅}}}. Furthermore, they are lists of precursors of {{∅}, {{∅}}} and, so, Axiom
3.3 assures us that they form a list. That list, {∅, {∅}}, is ⋃{{∅}, {{∅}}} or, in slightly
different notation, {∅} ∪ {{∅}}. More generally, if a list is pure, then Axiom 3.3 assures
us that the entries of its entries really do form a list. If a list is impure, we receive no
such assurance.


Exercise 3.28 Show that if x is hereditarily finite, then so is ⋃x. (You might assume
that a super-number y lists x and then try to show that ⋃x is a part of y.)
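
As an informal illustration, and not as part of the axiomatic development, the operation of Definition 3.12 can be tried out in Python by modelling pure hereditarily finite lists as nested frozenset values. The names EMPTY and big_union are illustrative assumptions, and the model simply takes for granted that the lists in question exist, which is exactly what the axioms have to establish.

```python
from itertools import chain

EMPTY = frozenset()  # stands in for the blank list ∅


def big_union(x):
    """Return the list of the entries of x's entries (Definition 3.12)."""
    return frozenset(chain.from_iterable(x))


one = frozenset({EMPTY})                # {∅}
x = frozenset({one, frozenset({one})})  # {{∅}, {{∅}}}

# The entries of the entries of x are ∅ and {∅}.
print(big_union(x) == frozenset({EMPTY, one}))  # True
```

A frozenset of frozensets is automatically pure, so the impure example {{Frege}, {Hilbert}} has no counterpart in this model.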


Exercise 3.29 Show that if x is hereditarily finite, then so is Px. 


Exercise 3.30 Show that if x and y are hereditarily finite, then {x, y} exists and is 
hereditarily finite. (Exercise 3.26 might be useful.) 


Exercise 3.31 Show that if some lists are all entries of a hereditarily finite list, then 
they form a hereditarily finite list. 


Exercise 3.32 Show that hereditarily finite lists with the same hereditarily finite 
entries are the same. 


Exercises 3.28–3.32 show that five of the axioms from the theory Z, Zermelo’s
classic axiomatization of set theory, can be interpreted as statements about hereditarily
finite lists: just read “set” as “hereditarily finite list” and “is an element of” as
“is an entry of.”¹³

The following theorem will be useful in Chaps. 4 and 5. 


Theorem 3.17 No hereditarily finite list lists every number. 


Proof By Exercise 3.25, if some hereditarily finite list listed every number, some
super-number would do so as well. We will use Theorem 3.13 (super-number induction)
to show that this is impossible. Note, first, that ∅ omits every number. Suppose
x omits the number n. Then Sn is not a part of x since Sn lists n. So Px omits Sn. □


3.9 Finite Ranks 


Here is a fairy tale. In spite of their name, the Listers have never done any listing. 
Now they are going to make lists galore. They proceed systematically, in a series 
of stages. At each stage, they list lists from earlier stages. They are careful not to 
reproduce earlier lists; but they are also careful not to miss any opportunity to make 
new lists. The Listers will list any earlier lists as long as the result is a new list. 

The process begins at stage 0. Since the Listers have never listed, their first list of 
earlier lists is blank. 


Stage 0 yields ∅.


At the next stage, stage 1, the Listers list the one and only earlier list. Their one new 
list is the list whose only entry is the blank list. 


¹³ See Zermelo [14]; English translation in van Heijenoort [13], pp. 199-215.


Stage 1 yields {∅}.


When they reach stage 2, the Listers have already made two lists: ∅ and {∅}.¹⁴ They
obtain one new list by listing both of these lists. They obtain another new list by
listing {∅} all by itself. They do not list ∅ all by itself because they already did that
at stage 1.


Stage 2 yields {{∅}} and {∅, {∅}}.


They now proceed to stages 3, 4, 5 and beyond. 


Exercise 3.33 Define a function that gives the number of lists appearing at stage n. 
(You can use recursion if you want.) Do you think the Listers can complete stage 4? 
Do you think they can really complete stage 5? 


If a list appears at stage n in the Listers’ listing process, we say that its RANK is 
n. ∅ has rank 0. {∅} has rank 1. {{∅}} and {∅, {∅}} have rank 2. A graph may help to
make the notion of rank more vivid. Here are the Listers’ first four lists with arrows 
representing the relation “is an entry of.” 


∅ → {∅} → {{∅}}
∅ → {∅, {∅}}
{∅} → {∅, {∅}}


Note that the rank of list x is the number of arrows in the longest path from ∅ to x.
In the Listers’ listing process, a list appears just after all its entries have appeared.
Each list has to wait for the appearance of its latest arriving entries. When we count
arrows in the longest path from ∅ to x, we are letting the ranks of x’s latest arriving
entries determine the rank of x.
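
The longest-path description of rank can be sketched in Python, again modelling lists as nested frozenset values. This is only an illustration under that modelling assumption: the function name rank and the variable names are hypothetical, and the recursion reads "longest path from ∅" as "one more than the greatest rank among the entries."

```python
EMPTY = frozenset()  # the blank list ∅


def rank(x):
    """Number of arrows in the longest path from ∅ to x."""
    if not x:
        return 0  # ∅ has no entries, so no arrows lead into it
    # A list appears one stage after its latest arriving entry.
    return 1 + max(rank(entry) for entry in x)


one = frozenset({EMPTY})         # {∅}
two_a = frozenset({one})         # {{∅}}
two_b = frozenset({EMPTY, one})  # {∅, {∅}}

print([rank(l) for l in (EMPTY, one, two_a, two_b)])  # [0, 1, 2, 2]
```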


Exercise 3.34 Determine the rank of {{∅}, {{{∅}}}}. Determine the rank of


{CHEE BE 


How does the bracket notation “ {...}” help us determine ranks? 


¹⁴ OK, I am being sloppy here. Stated more precisely, the Listers have made tokens of the two types
∅ and {∅}. We do not need to imagine that they make the types. At each stage, they list types that
have tokens produced at earlier stages.


Here is yet another way to think about ranks. In a moment, I will introduce
some lists

V(0), V(1), V(2), ...

that will turn out to be our old friends the super-numbers

∅, P∅, PP∅, ....


Here are V(0), V(1), V(2), V(3).

∅
{∅}
{∅, {∅}}
{∅, {∅}, {{∅}}, {∅, {∅}}}


Although V(0) lists nothing, each subsequent V(n) lists the lists of rank less than n.
V(1) lists the one list of rank 0. V(2) lists the one list of rank 0 and the one list of
rank 1. V(3) lists the one list of rank 0, the one list of rank 1, and the two lists of rank
2. Lists that first appear on list V(n + 1) have rank n. A list appears on V(n + 1) if
and only if it is part of V(n). So a list of rank n will be a part of V(n), but will not
be a part of V(k) for any k less than n. This will motivate the definition of rank that
appears below.

Though I will not give the details, we can use induction and some other tricks to
show that if n is a number, then there is a unique sequence of lists a_0, a_1, ..., a_n with
the following properties.¹⁵

a_0 = ∅
a_{Sk} = Pa_k

If we identify V(n) with the last term a_n in the sequence a_0, a_1, ..., a_n, we can
confirm that the lists V(0), V(1), ... have the following properties.

V(0) = ∅
V(Sk) = PV(k)


Assume that this is so. (Or verify it yourself, if you want.) You can now show that 
the lists V(n) are, in fact, our super-numbers. 
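
The two defining properties of the V(n)'s translate directly into a short Python sketch. It assumes lists are modelled as nested frozenset values and that P sends a list to the list of all its parts; the helper names parts and V are illustrative, not the book's.

```python
from itertools import combinations


def parts(x):
    """P x: the list of all parts of x."""
    items = list(x)
    return frozenset(
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    )


def V(n):
    """V(0) = ∅ and V(Sk) = P V(k)."""
    level = frozenset()
    for _ in range(n):
        level = parts(level)
    return level


sizes = [len(V(n)) for n in range(5)]
print(sizes)  # [0, 1, 2, 4, 16]
```

The sketch stops at V(4) on purpose: V(5) already has 65,536 entries, and V(6) is far beyond what any program could enumerate.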


Exercise 3.35 Use induction to show that if n is a number, then V(n) is a super- 
number. 


¹⁵ Among the details I am skipping over is the question of whether Axioms 3.1–3.3 allow us to
introduce the notion of finite sequence. You might find it interesting to show that they do.


Exercise 3.36 Use Theorem 3.13 (i.e., super-number induction) to show that if x is
a super-number, then x = V(n) for some number n. 


Now that you have confirmed that the lists V(0), V(1), V(2), ... are the
super-numbers ∅, P∅, PP∅, ..., you can use facts about super-numbers to obtain
results about the V(n)’s. For example, you might want to use Exercise 3.25 to complete
the following exercise.


Exercise 3.37 Use induction (on n) to show that if n lists m, then V(n) lists V(m).


Theorem 3.18 If V(m) = V(n), then m = n.


Proof Suppose V(m) = V(n). According to Theorem 3.9, if m and n are distinct, 
then one lists the other. Suppose n lists m. Then, by the result you just established, 
V(n) lists V(m) and, hence, V(n) lists V(n). Since this contradicts Theorem 3.16, 
we conclude that m and n are not distinct. □


Exercise 3.38 Show that if V(n) lists V(m), then n lists m. (You might find Theorem 
3.16 useful.) 


Since the hereditarily finite lists are the entries of the super-numbers, every hereditarily
finite list is an entry of V(n) for some number n. By Exercise 3.25, every entry
of V(n) is a part of V(n). Theorem 3.8 assures us that if x is a part of V(n) for some
number n, then there is a smallest such n. So the following definition is justified.


Definition 3.13 If x is hereditarily finite, then ρ(x) (the RANK of x) is the first
number n such that x is a part of V(n).


For the rest of this section, we will assume that x, y, z are hereditarily finite lists. 


Exercise 3.39 Use induction on n to show that if V(n) lists x, then n lists ρ(x).
(Note that if x is a part of V(n), then ρ(x) is no greater than n: that is, either n lists
ρ(x) or n = ρ(x).)


Definition 3.14 List y OUTRANKS list x if and only if ρ(y) lists ρ(x).


Exercise 3.40 Show that every hereditarily finite list outranks its entries.


Exercise 3.41 Show that if z outranks each entry of y, then y does not outrank z. 
(You might find Exercise 3.23 useful.) 


Exercise 3.41 says that each hereditarily finite list has the lowest rank compatible 
with Exercise 3.40. 


Exercise 3.42 Show that if y does not outrank z, then z outranks each entry of y.
(Note: since our ranks are numbers, they obey Theorem 3.9.)


Theorem 3.19 One hereditarily finite list outranks another if and only if some hereditarily
finite entry of the former outranks each hereditarily finite entry of the latter.


Proof According to Exercises 3.41 and 3.42, y fails to outrank z if and only if z
outranks each entry of y. This is equivalent to:


y outranks z ⟺ z fails to outrank some entry of y.


Applying Exercises 3.41 and 3.42 again, z fails to outrank a list x if and only if x
outranks each entry of z. So z fails to outrank some entry of y if and only if some
entry of y outranks each entry of z. Our grand conclusion:


y outranks z ⟺ some entry of y outranks each entry of z.


Exercise 3.25 guarantees that each entry of a hereditarily finite list is hereditarily
finite. So, given our standing assumption that y is hereditarily finite, “some entry of
y” is necessarily “some hereditarily finite entry of y.” □
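
A brute-force check is no substitute for the proof, but it can make Theorem 3.19 more concrete. The Python sketch below tests the equivalence on every pair of lists of rank at most 3. It assumes lists are modelled as nested frozenset values, computes rank as in Definition 3.13 (the first n such that x is a part of V(n)), and reads "rank(y) lists rank(x)" as rank(x) < rank(y). All function and variable names are hypothetical.

```python
from itertools import combinations


def parts(x):
    """P x: the list of all parts of x."""
    items = list(x)
    return frozenset(
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    )


def V(n):
    """V(0) = ∅ and V(Sk) = P V(k)."""
    level = frozenset()
    for _ in range(n):
        level = parts(level)
    return level


LEVELS = [V(n) for n in range(5)]  # V(0), ..., V(4)


def rank(x):
    """First n such that x is a part of V(n); valid for entries of V(4)."""
    return next(n for n, level in enumerate(LEVELS) if x <= level)


def outranks(y, x):
    return rank(x) < rank(y)


def theorem_3_19_holds(y, z):
    left = outranks(y, z)
    right = any(all(outranks(e, f) for f in z) for e in y)
    return left == right


universe = LEVELS[4]  # the 16 lists of rank at most 3
print(all(theorem_3_19_holds(y, z) for y in universe for z in universe))  # True
```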


Definition 3.15 Some hereditarily finite lists are, collectively, of BOUNDED RANK 
if and only if some hereditarily finite list outranks all of them. 


Exercise 3.43 Show that any hereditarily finite lists of bounded rank will form a 
hereditarily finite list. 


3.10 And Why Are We Doing This? 


Although we are marching under the banner of list theory, this is just a way of talking 
about set theory. So you have already received a large dose of set theory. You will 
receive two more doses in Chaps. 4 and 5. Perhaps, then, I should pause and reassure 
you that this treatment really is warranted. 

This book is supposed to provide a mathematical prelude to the philosophy of 
mathematics. It is supposed to help you understand philosophical work that has 
already been done and is also meant to prepare you to do philosophical work of your 
own. A big dose of set theory should help you both understand and do philosophy 
of mathematics. First of all, many philosophers of note have a lot to say about set 
theory. You will want to understand what they are saying. Secondly, if you are to have 
any chance of saying something intelligent about contemporary mathematics, you 
will need to have a good idea of what contemporary mathematics is. No whirlwind 
tour of the contemporary scene is better than the one provided by set theory. Even 
if you become a professional mathematician, all the mathematics you are likely to 
encounter or produce will be reducible to the set theories presented in Chaps. 3, 4 
and 5 or to some natural extension of those set theories. Granted, these three chapters
will not magically introduce you to all the specialized vocabularies, techniques, and
results of all the sub-fields of mathematics.¹⁶ Your apprenticeship in set theory will,
however, give you a general idea of what counts as a possible structure or operation 
in today’s mathematics. That is a goal worth pursuing. We continue our pursuit of it 
in the next chapter. 


3.11 Solutions of Odd-Numbered Exercises 


3.1 Let x be the list of all lists that do not list themselves. Does x list itself? It does 
so only if it is one of the lists that do not list themselves. That is, it does only if it 
does not. So x does not list itself and, so, appears on the list of all lists that do not 
list themselves. But that list is x itself. We conclude that x both does and does not 
list itself. That is, x would do this impossible thing if it existed. So no such list exists. 


3.3 We are going to treat “forming a pedigree” as a property that people possess 
collectively. Some people FORM A PEDIGREE if and only if each parent of one of 
them is one of them. x is an ANCESTOR of y if and only if, given any people who 
form a pedigree, if y is one of those people, then so is x. 


3.5 Suppose w is a precursor of x and x is a precursor of y. Given any entry-closed 
lists, if y is one of them, so is x. Given any entry-closed lists, if x is one of them, 
so is w. So, given any entry-closed lists, if y is one of them, so is w. That is, w is a 
precursor of y. 


3.7 Let “the Y-lists” be y itself and y’s entries. Suppose w is an entry of x and x 
is one of the Y-lists. If x = y, then w is one of the Y-lists by virtue of being an 
entry of y. Suppose x is an entry of y. Then w is an entry of an entry of y and, 
hence, is an entry of y. So, once again, w is one of the Y-lists. We conclude that the 
Y-lists are entry-closed. Suppose z is a precursor of y. Given any entry-closed lists, if 
y is one of them, so is z. So z is one of the Y-lists. That is, z is either y or an entry of y. 


3.9 Suppose x is a precursor of Py. By Exercise 3.6, x is either Py (and, hence, a 
list) or a precursor of an entry of Py. Suppose the latter. Then x is a precursor of 
a part of y. By Exercise 3.6 again, x is either a part of y (and, hence, a list) or a 
precursor of an entry of a part of y. Suppose the latter. Then x is a precursor of an 
entry of y and, hence, by Exercise 3.5, is a precursor of y. If y is pure, x is a list. We 
conclude that all of Py’s precursors are lists. 


3.11 Since y is one of its own precursors, Sy lists y. ∅ lists nothing.


¹⁶ To see how unlikely that is, take a look at the “Mathematics Subject Classification” at
http://www.ams.org/msc/pdfs/classifications2010.pdf. Note that this document is 47 pages long.


3.13 18 is 10010 (1·2^4 + 0·2^3 + 0·2^2 + 1·2^1 + 0·2^0) in binary. Here is the
decoding key.


List 100   List 11   List 10   List 1   List 0
1          0         0         1        0


We already know that List 1 is {∅}. We need to figure out what List 100 is. Here
is the decoding key for that.


List 10   List 1   List 0
1         0        0


We know that List 10 is {{∅}}. So List 100 is {{{∅}}} since it says yes only to List 10.
List 10010 says yes only to Lists 100 and 1. So it is {{{{∅}}}, {∅}}.
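
The decoding procedure used in this solution can be sketched in Python. The sketch assumes, as in the decoding keys above, that List i is an entry of List n exactly when the binary digit of n in position i is 1. The names decode and show are illustrative.

```python
def entry_codes(n):
    """Codes of the lists to which List n 'says yes'."""
    return [i for i in range(n.bit_length()) if (n >> i) & 1]


def decode(n):
    """The list coded by n, modelled as a nested frozenset."""
    return frozenset(decode(i) for i in entry_codes(n))


def show(n):
    """Bracket notation for List n, entries in decreasing code order."""
    if n == 0:
        return "∅"
    return "{" + ", ".join(show(i) for i in reversed(entry_codes(n))) + "}"


print(bin(18))   # 0b10010
print(show(18))  # {{{{∅}}}, {∅}}
```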


3.15 Given any S-closed lists, if ∅ is one of them, then ∅ is one of them.


3.17 Suppose Sm = Sn. Since Sm lists m, so does Sn. By Theorem 3.7, m is either 
n or an entry of n. Since Sn lists n, so does Sm. By Theorem 3.7, n is either m or an 
entry of m. Suppose m ≠ n. Then m is an entry of an entry of m. So, by Theorem 3.6,
m is an entry of itself, contrary to Exercise 3.16.


3.19 We want to show that every number n has the following property: if n ≠ ∅, then
n lists ∅. It is vacuously true that ∅ has this property. As an inductive hypothesis,
suppose n has the property. That is, n either is ∅ or lists ∅. So ∅ is either an entry or
an entry of an entry of Sn. By Theorem 3.6, ∅ is an entry of Sn.


3.21 Suppose Px = Py. Since Px lists x, so does Py. That is, x is a part of y. Since 
Py lists y, so does Px. That is, y is a part of x. So x and y have the same entries 
and, hence, by Axiom 3.1, x = y. 


3.23 It is vacuously true that ∅ lists the parts of its entries. Suppose y is a part of an
entry of Px. Then y is a part of a part of x and, hence, is a part of x. So Px lists y. 
This is one of those odd inductions without an inductive hypothesis. 


3.25 It is vacuously true that ∅ lists the entries of its entries. Suppose, as an inductive
hypothesis, that x lists the entries of its entries. Suppose y is an entry of an entry of
Px. Then y is an entry of a part of x and, hence, is an entry of x. By our inductive
hypothesis, y’s entries are entries of x and, hence, y is a part of x. So y is an entry
of Px.


3.27 By Theorems 3.10 and 3.11, P∅ is a super-number. So ∅ is hereditarily finite
since P∅ lists ∅. Suppose, as an inductive hypothesis, that n is hereditarily finite.
Pick a super-number x that lists n. Suppose Sn lists y. Then, by Theorem 3.7, y is
either n or an entry of n. That is, y is either an entry of x or an entry of an entry of x.
By Exercise 3.25, y is an entry of x. We conclude that Sn is a part of x and, hence,
is an entry of Px. This confirms that Sn is hereditarily finite if n is.


3.29 By Theorem 3.11 and Exercise 3.24, if super-number y lists x, then super-number
Py lists Px.


3.31 By Exercise 3.25, any entries of a hereditarily finite list are entries of a super- 
number. By Theorem 3.2 and Exercise 3.22, any entries of a super-number form a 
list. That list will be part of the super-number. So, if the super-number is x, the list 
will be an entry of Px. 


3.33 Suppose you are making a list and there are n things that could appear as entries. 
You are free to list all, some, or none of those n things. Once you finish your list, 
each of the n things will be in one of two possible states: on the list or not. Your list 
will record n decisions about two possible outcomes. So there are 2^n lists you could
make. If you start with 0 things, the number of possible lists is 2^0: you can only
produce the blank list ∅. Repeat the process starting with the one blank list and the
number of possible lists is 2^1: you could produce either ∅ or {∅}. Repeat the process
starting with these two lists and the number of possible lists is 2^2. The number of
lists that could appear at each stage of this process is given by the following function.


a(0) = 0
a(n+1) = 2^{a(n)}


There is, however, a complication. The Listers do not reproduce lists that have already 
appeared. The following function makes the necessary correction. 


f(n) =a(n+ 1) — a(n) 


To complete stage 4, the Listers would have to list 65,520 lists. If they listed one
list per second, this would take 18 h and 12 min. To complete stage 5, the Listers
would have to list 2^{65536} − 65,536 lists. At one list per second, this would take about
10^{19,710} times the age of the universe.
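
The two functions in this solution are easy to evaluate in Python, whose integers have unlimited size. The sketch below assumes the recursion a(0) = 0, a(n+1) = 2^{a(n)} and the correction f(n) = a(n+1) − a(n) given above; the Python names a and f simply mirror the text.

```python
def a(n):
    """a(0) = 0 and a(n+1) = 2 ** a(n)."""
    total = 0
    for _ in range(n):
        total = 2 ** total
    return total


def f(n):
    """Number of new lists appearing at stage n."""
    return a(n + 1) - a(n)


print([f(n) for n in range(5)])  # [1, 1, 2, 12, 65520]

# Time for stage 4 at one list per second, in hours and minutes.
hours, leftover_seconds = divmod(f(4), 3600)
print(hours, leftover_seconds // 60)  # 18 12
```

The value f(5) is still computable as an exact integer, but printing it in decimal is impractical: it has nearly twenty thousand digits.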


3.35 By Theorem 3.10, V(0) is a super-number. By Theorem 3.11, PV(n) is a super-number
if V(n) is. So V(Sn) is a super-number if V(n) is.


3.37 It is vacuously true that the result holds when n = ∅. As an inductive
hypothesis, suppose n lists m only if V(n) lists V(m). Suppose Sn lists m. Since
V(Sn) = PV(n), we want to show that V(m) is a part of V(n) (and, hence, an entry
of PV(n)). By Theorem 3.7, m is either n or an entry of n. If m = n, then V(m) is a
part of V(n) since every list is a part of itself. Suppose m is an entry of n. Then, by
our inductive hypothesis, V(n) lists V(m) and, hence, by Exercise 3.25, V(m) is a
part of V(n).


3.39 It is vacuously true that the result holds when n = ∅. Suppose V(Sn) lists x.
Then x is an entry of PV(n) and, hence, a part of V(n). ρ(x) is the first number that
behaves like this n. So ρ(x) is either n or an entry of n. By Theorem 3.7, ρ(x) is an
entry of Sn.


3.41 Suppose z outranks each entry of y. That is, ρ(z) lists ρ(x) whenever y lists x.
By Exercise 3.37, V(ρ(z)) lists V(ρ(x)) whenever y lists x. x is a part of V(ρ(x)).
So, by Exercise 3.23, V(ρ(z)) lists x whenever y lists x. That is, y is a part of
V(ρ(z)). ρ(y) is the first number n such that y is a part of V(n). So ρ(z) cannot be
smaller than ρ(y).


3.43 Pick a hereditarily finite list y and some hereditarily finite lists “the X’s” with
the following property: if x is one of the X’s, then ρ(y) lists ρ(x). By Exercise 3.37,
if x is one of the X’s, then V(ρ(y)) lists V(ρ(x)). Every hereditarily finite x is part
of V(ρ(x)). So, by Exercise 3.23, if x is one of the X’s, then V(ρ(y)) lists x. We
conclude that the X’s are all entries of V(ρ(y)). Theorem 3.2 and Exercise 3.22 now
yield a list whose entries are exactly the X’s. Since that list is a part of V(ρ(y)), it
is an entry of PV(ρ(y)) and, so, is hereditarily finite.


References 


1. Ackermann, W. (1937). Die Widerspruchsfreiheit der allgemeinen Mengenlehre. Mathematische Annalen, 114, 305-315.

2. Bennett, D. (2000). A single axiom for set theory. Notre Dame Journal of Formal Logic, 41, 152-170.

3. Deiser, O. (2011). An axiomatic theory of well-orderings. Review of Symbolic Logic, 4, 186-204.

4. Edwards, H. M. (1977). Fermat’s last theorem: A genetic introduction to algebraic number theory. New York: Springer-Verlag.

5. Frege, G. (1879). Begriffsschrift. Halle: Louis Nebert.

6. Hallett, M. (1984). Cantorian set theory and limitation of size. Oxford: Clarendon Press.

7. Kanamori, A. (2004). Zermelo and set theory. Bulletin of Symbolic Logic, 10, 487-553.

8. Kaye, R., & Wong, T. L. (2007). On interpretations of arithmetic and set theory. Notre Dame Journal of Formal Logic, 48, 497-510.

9. Landini, G. (2013). Zermelo and Russell’s paradox: Is there a universal set? Philosophia Mathematica, 21, 180-199.

10. Mirimanoff, D. (1917). Les antinomies de Russell et de Burali-Forti et le problème fondamental de la théorie des ensembles. L’Enseignement Mathématique, 19, 37-52.

11. Pollard, S. (1990). Philosophical introduction to set theory. Notre Dame IN: University of Notre Dame Press.

12. Rang, B., & Thomas, W. (1981). Zermelo’s discovery of the ‘Russell paradox’. Historia Mathematica, 8, 15-22.

13. van Heijenoort, J. (Ed.). (1967). From Frege to Gödel. Cambridge MA: Harvard University Press.

14. Zermelo, E. (1908). Untersuchungen über die Grundlagen der Mengenlehre I. Mathematische Annalen, 65, 261-281.


Chapter 4 
Zermelian Lists 


4.1 Infinite Lists 


Given any natural number n, our principles guarantee us a list with n entries. Indeed, 
in our construction, the number n is itself a list with n entries. 10^{100} is a list with
10^{100} entries. Now we human beings are never going to list 10^{100} things. We might
describe them in some way; but we normally think of a description as an alternative 
to a list, not a kind of list. So our principles posit a world inhabited by infinitely 
many lists, infinitely many of which surpass, to an extravagant degree, our limited 
capacity to list things. We may not believe there is such a world; but the Ackermann 
interpretation discussed in the last chapter provides an excellent reason to believe 
there could be. As mathematicians, we have found it interesting and productive to 
explore what such worlds would be like. 

Having already left far behind what is humanly feasible in our pursuit of what 
is logically possible, we may feel comfortable with another leap. We already have 
infinitely many lists. How about a list with infinitely many entries? What might the 
entries be? Well, there are infinitely many hereditarily finite sets. Perhaps there could 
be a list of them. Let us try out that idea. 


Axiom 4.1 The hereditarily finite lists form a list. 


The list of all hereditarily finite lists is known as V(ω). (‘ω’ is the Greek letter
omega.) As before, we claim no certain knowledge that our axiom is true, that there
is such a thing as V(ω). We are just letting everyone know we are interested in what
the world would be like if there were.


Exercise 4.1 Show that V(ω) is pure.


Exercise 4.2 Show that the numbers form a list.


The list of all numbers is known as ‘ω’. This may help to explain the name ‘V(ω)’. If
we start with ∅, then n applications of the operation P will give us V(n). If we start
with ∅, then ω applications of the operation P will give us V(ω). In conventional set
theoretic terminology:

V(0) = ∅
V(n + 1) = PV(n)
V(ω) = ⋃_{n∈ω} V(n)


Something is listed by V(ω) if and only if it is listed by V(n) for some number n.
You proved in Chap. 3 that the V(n)’s are the super-numbers. So V(ω) is what you
get when you list everything that appears in the construction of the super-numbers.


V(ω) = V(0) ∪ V(1) ∪ V(2) ∪ ··· = ∅ ∪ P∅ ∪ PP∅ ∪ ···


V(ω) lists everything you obtain when you start with ∅ and complete ω applications
of the operation P, one application for each number 0, 1, 2, 3, ....

We need to consider whether our new system, consisting of Axioms 3.1, 3.2, 3.3, 
and now 4.1, is consistent. First, though, another question. Is it really helpful, does 
it really aid understanding, to continue to use the word “list” at this point? Hasn’t 
our everyday notion of list been stretched beyond recognition when we start talking 
about infinite lists? Better, perhaps, to avoid unnecessary confusion and resistance 
by saying good-bye to “list” and introducing a new term not so loaded down with 
preconceptions. 

Well, I am not sure one person can speak for everyone on the question of how 
flexible our concepts are. My concept of list might be far more flexible than yours. 
How then would we decide whether the concept has been stretched too far? Luckily, 
we do not need to reach any such decision. We already know that we will need to 
abandon “list” before long. We want to be understood by other English-speaking 
mathematicians; and they say “set” not “list.” We need to learn to say “set” too. We 
shall stick to “list” for just a few more pages. 

Now, is our new system consistent? It turns out to be impossible to interpret all four 
of our axioms as truths of arithmetic. Ackermann’s trick cannot help us here. But we 
still have good evidence that our system is consistent. Zermelo managed to describe 
a structure, V(ω + ω), in which all our principles would be true.¹ Some of the most
profound mathematicians of the last century thought deeply about that structure, gave 
strong indications of having formed a clear concept of it, and reported no signs of 
incoherence. That is not a proof of consistency. But proof is not the only kind of 
evidence available to mathematicians nor is it always the best kind of evidence. Our 
concern is that we might spend a lot of time seeing what follows from premises that 
turn out to be absurd. Since everything would be true in a logically impossible world, 
the painstaking accumulation of theorems is quite misguided when the axioms are 
incoherent. Ours might be; but we have very good reason to think they are not. Let 
us be bold and see what sort of world we are now contemplating. 

¹ See Zermelo [2].


You might think of ω + ω as the result of the following construction.

ω + 0 = ω
ω + Sn = S(ω + n)
ω + ω = ⋃_{n∈ω} (ω + n)

That is,

ω + ω = (ω + 0) ∪ (ω + 1) ∪ (ω + 2) ∪ ··· = ω ∪ Sω ∪ SSω ∪ ···

ω + ω lists everything you obtain when you start with ω and complete ω iterations
of the operation S.

You can think of V(ω + ω) as the result of the following construction.

V(ω + 0) = V(ω)
V(ω + Sn) = PV(ω + n)
V(ω + ω) = ⋃_{n∈ω} V(ω + n)

That is,

V(ω + ω) = V(ω + 0) ∪ V(ω + 1) ∪ V(ω + 2) ∪ ··· = V(ω) ∪ PV(ω) ∪ PPV(ω) ∪ ···

V(ω + ω) lists everything you obtain when you start with V(ω) and complete ω
iterations of the operation P.

Our four axioms allow us to show that ω + n and V(ω + n) exist for each number
n. They do not allow us to show that either ω + ω or V(ω + ω) exists. Our four
axioms and the axioms of Zermelo’s set theory Z can all be interpreted as statements
about the entries of V(ω + ω). That is, to make our axioms and Zermelo’s axioms
true, you need only (1) complete ω applications of the operation P starting from the
blank list; (2) list every list you encounter in this construction; then (3) starting from
that list, complete ω additional applications of P.


Exercise 4.3 Dream up an axiom that guarantees the existence of both ω + ω and
V(ω + ω).


Exercise 4.4 Suppose x and y are parts of V(ω). Show that there is a list x ∪ y that
lists, exactly, the entries of x and the entries of y.


Exercise 4.5 Show that ω lists the entries of its entries or, in other words, that every
number is a list of numbers.


Exercise 4.6 Suppose x is a part of ω. Show that there is a list ω\x that lists, exactly,
the numbers that x does not list.


4.2 More Ranks


We now continue the discussion of ranks we began in the previous chapter. We will 
assign ranks to lists in our new, larger universe. We can use induction to show that 


if n is a number, then there is a unique sequence of lists a_0, a_1, ..., a_n and a unique
sequence of lists b_0, b_1, ..., b_n with the following properties.

a_0 = ω
a_{Sk} = Sa_k
b_0 = V(ω)
b_{Sk} = Pb_k

If we identify ω + n with the last term a_n in the sequence a_0, a_1, ..., a_n and identify
V(ω + n) with the last term b_n in the sequence b_0, b_1, ..., b_n, we can confirm that
the lists ω + 0, ω + 1, ... and V(ω + 0), V(ω + 1), ... have the properties we
noted above:

ω + 0 = ω
ω + Sn = S(ω + n)
V(S(ω + n)) = PV(ω + n).


Assume this is so. (Or verify it yourself, if you want.) You can now show that the
lists ω + n are the lists we are going to call HIGHER RANKS:


ω, Sω, SSω, ....


The higher ranks are what tireless immortal beings encounter when they start with
ω (the list of all numbers) and apply the operation S to everything they encounter.


Exercise 4.7 Using Definition 3.7 as your guide, say what it means for something
to be one of the higher ranks ω, Sω, SSω, .... Confirm that the higher ranks are the
lists ω + n.


You can also show that the lists V(ω + n) are the lists we are going to call
SUPER-DUPER-NUMBERS:


V(ω), PV(ω), PPV(ω), ....


Super-duper-numbers are what tireless immortal beings encounter when they start
with V(ω) (the list of all hereditarily finite lists) and apply the operation P to everything
they encounter.


Exercise 4.8 Using Definition 3.9 as your guide, say what it means for something
to be a super-duper-number. Prove “super-duper” versions of Exercises 3.35 and
3.36.


Exercise 4.9 Show that every higher rank lists the entries of its entries.


Since ω + Sn = S(ω + n), Exercises 3.7 and 4.9 imply that the entries of ω + Sn
are ω + n and the entries of ω + n.


Exercise 4.10 Show that every super-duper-number lists the entries of its entries. 
(You might consider a super-duper induction.) 


Exercise 4.11 Show that if n ≠ 0, then ω + n lists ω. (You might consider induction
on n.)


Exercise 4.12 Show that n lists m only if ω + n lists ω + m. (If you feel like another
induction, you might consider doing it on n.)


Exercise 4.13 Show that no higher rank lists itself.


Theorem 4.1 If ω + n lists ω + m, then n lists m.


Proof Suppose ω + n lists ω + m. Suppose m lists n. Then, as you showed in Exercise
4.12, ω + m lists ω + n. So, according to Exercise 4.9, ω + m will list itself, contrary
to Exercise 4.13. It will also contradict Exercise 4.13 if m = n. So Theorem 3.9
implies that n lists m. □


We have confirmed that the listing relation orders the higher ranks the same way
it does the numbers:


ω + n lists ω + m ⟷ n lists m.


So, in fact, the listing relation well-orders the higher ranks. (Recall Sect. 3.6.) In
particular, if some higher ranks satisfy a certain condition, there will be a unique first
higher rank that does so.


Exercise 4.14 Use induction (on n) to show that n lists m only if V(ω + n) lists
V(ω + m).


Exercise 4.15 Show that no super-duper-number lists itself. (Theorem 3.17 might
be useful.)


Exercise 4.16 Show that V(ω + n) lists V(ω + m) only if n lists m.


We have confirmed that the listing relation orders the super-duper-numbers the
same way it does the numbers:


V(ω + n) lists V(ω + m) ⟷ n lists m.


So, in fact, the listing relation well-orders the super-duper-numbers.

Our RANKS are the numbers and the higher ranks.


Exercise 4.17 Show that if α and β are ranks, then V(β) lists V(α) if and only if β
lists α.


Exercise 4.18 Show that the listing relation well-orders our ranks.


Definition 4.1 If x is a part of V(α) for some rank α, then ρ(x) is the first such α.


Exercise 4.19 What is ρ(ω)?


Definition 4.2 A list is ZERMELIAN if and only if it is an entry of a
super-duper-number.²


We retain Definition 3.14 as our explanation of “outranking.” 


Theorem 4.2 One Zermelian list outranks another if and only if some Zermelian 
entry of the former outranks each Zermelian entry of the latter. 


Proof Zermelian sets have all the properties of hereditarily finite sets that we used
in our proof of Theorem 3.19. (It might be a good exercise to double-check that.) □


Definition 4.3 Some Zermelian lists are, collectively, of BOUNDED RANK if and 
only if a Zermelian list outranks all of them. 


Exercise 4.20 Show that any Zermelian lists of bounded rank will form a Zermelian 
list. 


Exercise 4.21 Confirm that there is a Zermelian list with no Zermelian entries. 


Exercise 4.22 Confirm that Zermelian lists with the same Zermelian entries are the 
same. 


Exercise 4.23 Confirm that our ranks are Zermelian and list only ranks. 


Exercise 4.24 Show that no Zermelian list lists every rank. (You might want to look 
at the proof of Theorem 3.17.) 


It may not be obvious why Theorem 4.2 and Exercises 4.20-4.24 are of any 
particular interest. Be patient. They will play a key role in our exploration of set 
theory in Chap. 5. Before turning to sets, however, we will use our profusion of lists 
to develop the foundations of calculus. 


² The entries of the super-duper-numbers form a model of Zermelo’s set theory Z. For a helpful
discussion see Uzquiano [1] (http://www.jstor.org/stable/421182).


4.3 Real Numbers


We have been a little sloppy, saying “numbers” when we mean the natural numbers
0, 1, 2, 3, .... We are going to continue this practice even though we know perfectly
well that mathematicians recognize other numbers. It is only our dialect that is
quirky, however. Our theory does not confine us to the naturals. Indeed, our theory
lets us show that some parts of ω, when suitably ordered, behave just like the
(non-negative) real numbers. Before I can say which parts, I need to introduce some new
terminology. Parts of ω that omit only finitely many numbers are said to be cofinite.
A part of ω that omits only finitely many numbers will have a largest non-entry if it
has any non-entries at all.


Definition 4.4 A part A of ω is COFINITE if and only if ω\A has a largest entry or
has no entries. (Recall that ω\A is the list of numbers not listed by A.)


ω itself is cofinite because it omits no numbers. ∅ is not cofinite because it omits
every number. The list of all numbers of the form 6a + 9b + 20c is cofinite because
it lists every number larger than 43. The list of all numbers of the form 6a + 9b is
not cofinite because it omits 3 and every number that is not a multiple of 3. We are
going to focus on lists of this latter sort: the non-cofinite ones. If A is one of these
lists, then ω\A has entries, but no largest one.
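
For readers who like to experiment, the two arithmetic examples above can be spot-checked by brute force. What follows is only a minimal Python sketch: the helper name representable and the search bound of 200 are assumptions made for illustration, and a finite search cannot by itself prove that a list is cofinite.

```python
from itertools import product

def representable(n, coeffs):
    """True if n is a sum of non-negative multiples of the given coefficients."""
    ranges = [range(n // c + 1) for c in coeffs]
    return any(sum(c * m for c, m in zip(coeffs, ms)) == n
               for ms in product(*ranges))

BOUND = 200  # assumption: a finite search window chosen for illustration

# Numbers below BOUND that each list omits.
omitted_a = [n for n in range(BOUND) if not representable(n, (6, 9, 20))]
omitted_b = [n for n in range(BOUND) if not representable(n, (6, 9))]

print(max(omitted_a))   # largest omission of the 6a + 9b + 20c list in the window
print(omitted_b[:6])    # first few omissions of the 6a + 9b list
print(omitted_b[-1])    # omissions continue right up to the edge of the window
```

Within the window, the first list stops omitting numbers after 43, while the second keeps omitting numbers all the way to the edge, which is what we expect of a non-cofinite list.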


Exercise 4.25 Show that each entry of ω (each number) is a non-cofinite part of ω.


Definition 4.5 An entry DISCRIMINATES between two lists if and only if it appears
on one but not the other. Two lists AGREE on an entry if and only if it does not
discriminate between them (that is, if and only if it appears on both or neither).


Definition 4.6 If A and B are non-cofinite parts of ω, we say A < B (A PRECEDES
B) if and only if A ≠ B and B lists the smallest number that discriminates between
A and B.³


Note that we are placing the non-cofinite parts of ω in a certain order, but we are not
comparing their sizes. For example, {1, 3} has more entries than {0}. Yet {1, 3} < {0},
since {0} lists the smallest number that discriminates between these two lists. Indeed,
the list of all odd numbers precedes {0} even though the former has infinitely more
entries than the latter.
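
Definition 4.6 is easy to try out on finite lists, which are always non-cofinite. The sketch below, in Python, represents a finite list of numbers as a set. The function name precedes is an assumption made for illustration, and infinite lists (such as the list of all odd numbers) are outside its reach.

```python
def precedes(a, b):
    """A < B in the sense of Definition 4.6, for finite lists given as sets."""
    discriminating = a ^ b            # numbers that appear on exactly one list
    if not discriminating:            # A = B, so A does not precede B
        return False
    return min(discriminating) in b   # B must list the smallest such number

print(precedes({1, 3}, {0}))   # True: 0 discriminates and {0} lists it
print(precedes({0}, {1, 3}))   # False
print(precedes({2}, {2}))      # False: nothing discriminates
```

The ordering ignores how many entries a list has. Only the smallest discriminating number matters.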


Exercise 4.26 Show that one non-cofinite part of ω precedes all the others.


Exercise 4.27 Show that if A is a non-cofinite part of ω, then A < n for some
number n.


³ Here is another way of expressing the same idea. If A is a part of ω and j is an entry of ω, let
A ∩ j be the list whose entries are, exactly, the entries of A smaller than (i.e., listed by) j. Now
suppose A and B are non-cofinite parts of ω. Then A < B if and only if, for some number j, B\A
lists j and (A ∩ j) = (B ∩ j).


Exercise 4.28 Show that if m and n are non-zero numbers, then {m} < n.


Exercise 4.29 Show that if m and n are numbers, then m < n if and only if m is
smaller than n (that is, n lists m).


Exercise 4.30 Suppose k is a number and A is a non-cofinite list of numbers. Show
that if A < {k}, then A < {k + 1, k + 2, ..., k + n} for some number n.


My statement of the last exercise is not really kosher because I have not said what
‘+’ means when applied to our numbers. Here is a more correct version: if A < {k},
then A < j\Sk for some number j. Recall that Sk = {0, 1, 2, ..., k}. So j\Sk is
the result of lopping off all the numbers 0, 1, 2, ..., k from the front of j. That is,
j\Sk has the form {k + 1, k + 2, ..., k + n} as in Exercise 4.30. Note that we obtain


A < j\Sk < {k}


no matter how close A is to {k}. So we can make j\Sk arbitrarily close to {k} by
picking a sufficiently large j. Or, to put it differently, if a list B squeezes between
j\Sk and {k}


A < j\Sk < B < {k}


then Exercise 4.30 allows us to leapfrog it by picking a big enough j′:


A < j\Sk < B < j′\Sk < {k}.


We now turn to some particularly important facts about the relation <. A, B, and C
are understood to be non-cofinite parts of ω.


Theorem 4.3 IRREFLEXIVITY: A ≮ A.


Proof There is no way that A ≠ A. □


Theorem 4.4 TRANSITIVITY: If A < B < C, then A < C.


Proof Let j be the smallest number that discriminates between A and B. Let k be the
smallest number that discriminates between B and C. Then A and B agree on every
number smaller than j, while B and C agree on every number smaller than k. B lists
j, but A does not. C lists k, but B does not. Note that j ≠ k. This leaves two cases.
Case 1: j is smaller than k. Then, since B lists j, so does C. Since B and C agree on
every number smaller than j, so do A and C. That is, j is the smallest number that
discriminates between A and C. Case 2: k is smaller than j.⁴ Then, since B does not
list k, neither does A. Since A and B agree on every number smaller than k, so do A
and C. That is, k is the smallest number that discriminates between A and C. □
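
A brute-force check is no substitute for the proofs just given, but it can be reassuring. The Python sketch below tests irreflexivity and transitivity on every part of {0, 1, 2, 3, 4}. The function name precedes and the choice of this small universe are assumptions made for illustration.

```python
from itertools import combinations, product

def precedes(a, b):
    """A < B in the sense of Definition 4.6, for finite lists given as sets."""
    discriminating = a ^ b
    return bool(discriminating) and min(discriminating) in b

# All 32 parts of {0, 1, 2, 3, 4}; each is finite and therefore non-cofinite.
parts = [frozenset(c) for r in range(6) for c in combinations(range(5), r)]

irreflexive = all(not precedes(a, a) for a in parts)
transitive = all(
    precedes(a, c)
    for a, b, c in product(parts, repeat=3)
    if precedes(a, b) and precedes(b, c)
)
print(irreflexive, transitive)

# An instance of Case 2 of the transitivity proof: here j = 2 and k = 1.
a, b, c = frozenset(), frozenset({2}), frozenset({1, 2})
print(precedes(a, b), precedes(b, c), precedes(a, c))
```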


Exercise 4.31 Show that < is CONNECTED. That is, if A ≠ B, then either A < B or
B < A.


⁴ Yes, this is possible. Note that ∅ < {2} < {1, 2}.


Theorem 4.5 DENSENESS: If A < C, then A < B < C for some finite B.


Proof Let j be the smallest number that discriminates between A and C. Since A is
not cofinite, we can let k be the smallest number larger than j that A does not list.
Let B be the result of adding k to A while deleting all entries larger than k. Then B
lists the smallest number that discriminates between A and B (namely, k) and C lists
the smallest number that discriminates between B and C (namely, j). □
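
The construction of B in this proof can be carried out mechanically when A is finite. Here is a minimal Python sketch under that assumption. The function names precedes and between are hypothetical, and the code takes for granted that A < C.

```python
def precedes(a, b):
    """A < B in the sense of Definition 4.6, for finite lists given as sets."""
    discriminating = a ^ b
    return bool(discriminating) and min(discriminating) in b

def between(a, c):
    """Build the finite list B of Theorem 4.5, assuming A is finite and A < C."""
    j = min(a ^ c)        # smallest number that discriminates between A and C
    k = j + 1
    while k in a:         # smallest number larger than j that A does not list
        k += 1
    # Add k to A while deleting all entries larger than k.
    return {n for n in a if n < k} | {k}

a, c = {1, 3}, {0}
b = between(a, c)
print(b)                               # {1, 2}
print(precedes(a, b), precedes(b, c))  # True True
```

In this example j is 0 and k is 2, so B keeps the entry 1 of A, gains 2, and loses 3.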


I need to prove one more fact about <. First, though, some preliminaries. We let
[∅, ω) be the list of all non-cofinite parts of ω. The idea is that these parts form a ray
with first entry ∅, but no last entry. We let [∅, A) be the list of non-cofinite parts of
ω that precede A. The interval [∅, A] will be the result of adding A to [∅, A).


Definition 4.7 Suppose M is part of [∅, ω). A is an UPPER BOUND of M if and only
if M is part of [∅, A]. If, in addition, no upper bound of M precedes A, then A is
M’s LEAST UPPER BOUND.


Now for our last fact about our ordering of [∅, ω).


Theorem 4.6 COMPLETENESS: Every part of [∅, ω) with an upper bound has a least
upper bound.


Proof I will omit some details. Suppose M is a part of [∅, ω) with an upper bound.
Since, by Exercise 4.27, each such upper bound precedes a number, some number is
an upper bound of M. Suppose 1 is the first such number. (You can check later to see
whether our reasoning about this case applies more broadly.) If M lists 1, then none
of M’s upper bounds can precede 1 and, hence, 1 is M’s least upper bound. Suppose
M does not list 1. Then all of M’s entries precede 1. We can use induction and some
other tricks to show that if m is any number, then ω has parts A_0, A_1, ..., A_m with
the following properties. First:


A_0 = ∅.


Second:


A_{k+1} = A_k if every entry of M precedes A_k ∪ {k};
A_{k+1} = A_k ∪ {k} otherwise.


Let A be the part of ω whose entries are the numbers that appear on one of the lists
A_n. In more conventional terminology:


A = ⋃_{n ∈ ω} A_n.


We are trying to creep up on M’s least upper bound from below. We exclude from
our big list A any number that would take us beyond all of M’s entries. For example,
we have the opportunity to add 0 at stage 1 of our construction, but decline to do so.
Instead, A_1 = A_0 = ∅ since


(A_0 ∪ {0}) = (∅ ∪ {0}) = {0} = 1


and all of M’s entries precede 1. Suppose, on the other hand, that some of M’s entries
do not precede {1}. Then adding 1 to our big list A would not take us too far. So we
would say


A_2 = (A_1 ∪ {1}) = (∅ ∪ {1}) = {1}


and, hence, A would list 1. What happens as we move down the list? Could A list
one number after another with no further omissions? First note: if none of M’s upper
bounds precede 1, then 1 is M’s least upper bound and we are done. Suppose, then,
that B is an upper bound that precedes 1. That is, B < {0} = 1. By Exercise 4.30,
B < {1, 2, ..., n} for some number n. All of M’s entries would precede {1, 2, ..., n}.
So, even if A listed every number from 1 to n - 1, it would omit n. We conclude that
A omits a number greater than 1. Let k be the first such number. Then,


{1, 2, ..., k - 1} = A_k = A_{k+1}.


{1, 2, ..., k} is an upper bound of M since all of M’s entries precede A_k ∪ {k}. If
none of M’s upper bounds precede {1, 2, ..., k}, we are done. On the other hand,
if an upper bound does precede {1, 2, ..., k}, we can repeat the above argument to
show that A omits a number greater than k. (Note, in particular, that


{1, 2, ..., k - 1} ≤ B < {1, 2, ..., k}


only if


B < {1, 2, ..., k - 1, k + 1, ..., k + j}


for some number j.) This holds quite generally: after each omission there will be
another omission. So A omits numbers, but there is no largest number it omits: A is
not cofinite. Is A an upper bound of M? Suppose A < C where C is an entry of M.
Let i be the first number listed by C but not by A. Then (A_j ∪ {j}) < C whenever
i < j since C will list i but A_j ∪ {j} will not. So


A_{j+1} = (A_j ∪ {j})


whenever i < j. Since this would make A cofinite, we conclude that each entry of M
either precedes A or is A. That is, A is an upper bound of M. Suppose B is an upper
bound of M that precedes A. Then A lists the smallest number that discriminates
between A and B. Suppose this number is m. Then not all of M’s entries precede
A_m ∪ {m}, but B does precede A_m ∪ {m}. If A_m ∪ {m} is itself an entry of M, then
B is not an upper bound of M. If A_m ∪ {m} precedes an entry of M, then B is not an
upper bound of M. So B is not an upper bound of M after all. We conclude that, in
fact, no upper bound of M precedes A. □


Exercise 4.31 and Theorems 4.3–4.6 state the characteristic properties of a linear
continuum. We have arranged the non-cofinite parts of ω to look like the real numbers
in the ray [0, ∞). Since they form the same structure as the non-negative reals, it
should not lead to any great confusion if we call them “real numbers.” Under the
Mirimanoff construction, each natural number is a non-cofinite part of ω (as you
showed in Exercise 4.25). So all our natural numbers are real numbers. Which real
numbers are they? Actually, a better question would be: which real numbers would we
like them to be? We have infinitely many alternatives. One alternative is particularly
attractive: we can let each natural number be itself. That is, we can let each natural
number n be the real number n. So, for example, {0, 1, 2, 3} can be the real number
4, while {0, 1, 2, 3, 4} can be the real number 5.

If a non-cofinite part of ω happens to be a Mirimanoff number, we can now say
what real number it is. (Itself!) What about all the other non-cofinite parts? Take
{0, 1, 2, 3, 4, 7, 10}, for example. To determine the integer part of this real number,
count how many consecutive numbers are listed starting from 0. That would be 5 (0
through 4). We now subtract the remaining numbers, one by one, from 5, raise 2 to
each of those powers, and add the results to 5.


5 + 2^(5-7) + 2^(5-10) = 5 + 2^(-2) + 2^(-5) = 5.28125.


So {0, 1, 2, 3, 4, 7, 10} is the real number 5.28125. What about {1, 2, 3, 4, 7, 10}?
The technique we just described yields the following.


0 + 2^(0-1) + 2^(0-2) + 2^(0-3) + 2^(0-4) + 2^(0-7) + 2^(0-10) ≈ 0.946289.

The integer part is 0 because, in {1, 2, 3, 4, 7, 10}, there are 0 consecutive numbers
starting from 0. Note that Exercise 4.30 has the following special case. If A < 1,
then


A < {1, 2, ..., n} = 2^(-1) + 2^(-2) + ... + 2^(-n)


for some number n.
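
The recipe for turning a list into a real number can be written out as a short Python sketch. It handles only finite lists, and the function name real_value is an assumption made for illustration. Exact fractions are used so that no rounding creeps in.

```python
from fractions import Fraction

def real_value(part):
    """Real number assigned to a finite list of numbers by the recipe above."""
    n = 0
    while n in part:      # integer part: count consecutive entries starting from 0
        n += 1
    # Each remaining entry m contributes 2 raised to the power n - m.
    return Fraction(n) + sum(Fraction(2) ** (n - m) for m in part if m > n)

print(float(real_value({0, 1, 2, 3, 4, 7, 10})))  # 5.28125
print(float(real_value({1, 2, 3, 4, 7, 10})))     # 0.9462890625
print(real_value({0, 1, 2, 3}))                   # 4
```

Note that the second value agrees with the rounded figure 0.946289 given above.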

The proof of Theorem 4.6 may make a bit more sense now. Suppose M lists all
the rational numbers whose square is less than 2. Then our construction of the lists
A_n begins as follows:


A_1 = {0}
A_2 = {0}
A_3 = {0}
A_4 = {0, 3}
A_5 = {0, 3, 4}.


We add 0, 3, and 4 to our list because they do not take us beyond the square root of 2:


{0} = 1 < √2
{0, 3} = 1 + 1/4 < √2
{0, 3, 4} = 1 + 1/4 + 1/8 < √2.


We omit 1 and 2 because they do take us beyond the square root of 2:


{0, 1} = 2 > √2
{0, 2} = 1 + 1/2 > √2.


So A_5 is the real number


1 + 1/4 + 1/8


whose square is 1.890625. The idea is to add any power of 1/2 that does not give our
sum a square greater than 2. Skipping ahead:


A_9 = {0, 3, 4, 6, 8}.


So A_9 is the real number


1 + 1/4 + 1/8 + 1/32 + 1/128


whose square is about 1.99957. The limiting value A is the smallest real number that
surpasses all the entries of M, that is, all the rationals less than √2. More briefly,
A = √2.
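
The stages of this construction can be reproduced with a short Python sketch. It rests on two assumptions that are labeled in the code: finite lists are converted to real numbers by the recipe described earlier, and, for this particular M, every entry of M precedes a finite list exactly when the square of that list's real value exceeds 2. The function names real_value and stages are hypothetical.

```python
from fractions import Fraction

def real_value(part):
    """Real number assigned to a finite list of numbers (recipe from the text)."""
    n = 0
    while n in part:
        n += 1
    return Fraction(n) + sum(Fraction(2) ** (n - m) for m in part if m > n)

def stages(count):
    """Return [A_0, ..., A_count] when M lists the rationals with square below 2."""
    a = frozenset()
    result = [a]
    for k in range(count):
        candidate = a | {k}
        # Assumption: every entry of M precedes the candidate exactly when the
        # candidate's square exceeds 2 (the square is rational, so never equal to 2).
        if real_value(candidate) ** 2 < 2:
            a = candidate           # adding k does not take us beyond M
        result.append(a)
    return result

A = stages(9)
print(sorted(A[5]), float(real_value(A[5]) ** 2))  # [0, 3, 4] 1.890625
print(sorted(A[9]), float(real_value(A[9])))       # [0, 3, 4, 6, 8] 1.4140625
```

Running more stages simply reads off further binary digits of the square root of 2.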


Exercise 4.32 What real number is {0, 1, 2, 6, 9, 14, 15, 16, 17, 18, 19}? 


Exercise 4.33 Show that the list of all odd numbers {1, 3, 5, 7, ...} is the real number
2/3. (If you get stuck, look up some facts about geometric series.)


As I noted a couple pages back, [∅, ω) (ordered by <) looks like [0, ∞) (ordered
by the usual “less than” relation). Why is this of any interest? Well, first, it tells us
something about quantity: ω has at least as many parts as there are non-negative
reals. (It has, in fact, exactly as many parts; but showing that requires a bit more
argument.) It also tells us something about logical possibility: if our theory of lists is 
consistent, then our concept of a linear continuum is not fundamentally incoherent. 
On a more philosophical note, we have also taken a step toward establishing the 
unity of mathematics. Since theorems about the real numbers can be interpreted as 
theorems about lists, mathematicians who study the reals will, whether they know 
it or not, be helping us see what follows from our theory of lists. The set theoretic 
reductive program of the last century showed that virtually all mathematicians are 
working out the consequences of a single comprehensive theory. So even if you find 
your colleague’s work quite incomprehensible, you can be confident you are laborers 
in the same vineyard. 


Exercise 4.34 What would go haywire if we dropped the qualification “non-cofinite”
from our definition of the real numbers?


4.4 From Lists to Sets


The Oxford English Dictionary (OED) recognizes two substantive uses of the term
“set.” The first (“set” as in “sunset”) is remote from the mathematical sense. The
second is somewhat closer: a number of things or persons set or placed together.
For those who are curious about what this sort of “setting” might involve, the OED
helpfully provides a discussion of the verb “to set” occupying more than six thousand
lines! None of the twelve types of “setting” recognized by the OED involve an
operation that would yield the sets of mathematics. This supports the following
bland observation: mathematicians have borrowed the everyday word “set” and are
using it as a technical term.

Many terms borrowed in this way acquire meanings quite distant from everyday 
language and experience. This does not mean it has to be an ordeal to learn the 
technical vocabulary. Some people just “get it” right away, as if they had a deep- 
seated predisposition to learn (or invent) the stuff. Others do just fine with the help 
of some analogies or similes or other little nudges. Set theory is treacherous ground 
because we innocently expect our everyday understanding of sets to be a rich source 
of helpful nudges and signposts. It is not. 

Luckily, we can find help elsewhere: everyday language and experience make 
it easy for us to talk sensibly about unranked list-types; and unranked list-types 
behave (or can easily be imagined to behave) very much like mathematical sets. 
Mathematical sets are not part of our everyday world. Some may even find unranked 
list-types a bit weird. Not to worry: paper lists are certainly familiar. Our experience 
with paper lists makes it easy for us to gab about unranked list types. At that point, 
set theory itself is within walking distance. This is a blessing: it offers a smooth 
path into the heart of modern mathematics. The various forms of hoodwinking we 
professors have been practicing for years are, it turns out, entirely unnecessary. 


4.5 Solutions of Odd-Numbered Exercises 


4.1 By Exercise 3.25, entries of hereditarily finite lists are hereditarily finite. So V(ω)
lists the entries of its entries and, hence, by Exercise 3.7, the precursors of V(ω) are
V(ω) and the entries of V(ω). But V(ω) and its entries are all lists.


4.3 You might assume there is a list that lists V(ω) and, furthermore, lists Px
whenever it lists x.


4.5 Each entry of ∅ is a number. Suppose n is a number whose entries are all numbers.
By Theorem 3.7, the entries of Sn are n and the entries of n. So every entry of Sn is
a number. We conclude that entries of numbers are numbers and, hence, ω lists the
entries of its entries.


4.7 Say that a list x is a HIGHER RANK if and only if it satisfies the following
condition: given any S-closed lists, if ω is one of them, then so is x. This immediately
yields an induction principle: if some lists are S-closed and ω is one of them, then
every higher rank is one of them. Our definition also guarantees that ω is a higher
rank. Our first job is to show that if x is a higher rank, then x = ω + n for some
number n. This is an easy induction because ω = ω + 0 and, if x = ω + n, then
Sx = S(ω + n) = ω + Sn. As for the converse, we need to show that ω + n is a higher
rank whenever n is a number. ω + 0 is a higher rank because ω is. Suppose ω + n is
a higher rank x. Then ω + Sn = S(ω + n) = Sx. But our definition guarantees that
Sx is a higher rank whenever x is.


4.9 In Exercise 4.5, you proved that ω lists the entries of its entries. Suppose x lists
the entries of its entries. Then, according to Exercise 3.7, the precursors of x are x
itself and x’s entries. Suppose Sx lists y and y lists z. Then y is either x itself or an
entry of x. In either case, z is a precursor of x and, so, is listed by Sx. That is, Sx
lists the entries of its entries.


4.11 It is vacuously true that ω + 0 has the desired property. Suppose ω + n lists ω.
Since ω + n is an entry of S(ω + n) and S(ω + n) = ω + Sn, ω + n is an entry of
ω + Sn and, hence, by Exercise 4.9, ω is an entry of ω + Sn.


4.13 If ω listed itself, it would be a number and, hence, a number would list itself,
contrary to Exercise 3.16. Suppose x is a higher rank that does not list itself. Sx lists
the precursors of x and, so, by Exercises 3.7 and 4.9, Sx lists x and the entries of x.
Note that Sx ≠ x since x does not list x. If Sx were an entry of x, then x would be
an entry of an entry of x, contrary to Exercise 4.9. So Sx is neither x nor an entry
of x and, so, does not list itself.


4.15 By Exercise 3.27, V(ω) lists every number. If V(ω) listed itself, it would be
hereditarily finite and, hence, a hereditarily finite list would list every number,
contrary to Theorem 3.17. Suppose Px lists itself. Then Px is a part of x and, hence,
x is an entry of x since x is an entry of Px. We conclude: if x does not list itself,
then neither does Px.


4.17 If α and β are both numbers, we apply Exercises 3.37 and 3.38. If α and β
are both higher ranks, we apply Exercises 4.14 and 4.16. Suppose α is a number
and β is a higher rank. By Exercises 4.9 and 4.11, β lists α. By Exercises 3.19
and 4.14, V(β) either lists V(ω) or is V(ω). V(α) is hereditarily finite since it is
an entry of V(Sα). So V(ω) lists V(α) and, hence, by Exercise 4.10, V(β) lists V(α).


4.19 By Exercise 3.27, ω is a part of V(ω). If ω were a part of V(n), n a number,
then ω would be an entry of V(n + 1), contrary to Theorem 3.17. So ω does not
list any rank α such that ω is a part of V(α) and, hence, ρ(ω) = ω.


4.21 ∅ is hereditarily finite because it is an entry of the super-number P∅. So ∅ is
an entry of the super-duper-number V(ω) and, hence, is Zermelian. Since ∅ has no
entries, it has no Zermelian entries.


4.23 By Exercise 3.27, each number is an entry of V(ω) and, hence, is Zermelian.
We are going to use induction to confirm that each higher rank ω + n is an entry
of V(ω + Sn) and, hence, is Zermelian. Since ρ(ω) = ω (Exercise 4.19), ω is an
entry of V(ω + S0). As an inductive hypothesis, suppose V(ω + Sn) lists ω + n.
By Exercises 3.7, 4.9, and 4.10, every precursor of ω + n is an entry of V(ω + Sn).
So S(ω + n) is a part of V(ω + Sn) and, hence, is an entry of PV(ω + Sn). But
S(ω + n) = (ω + Sn) and PV(ω + Sn) = V(ω + SSn). So ω + Sn is an entry
of V(ω + SSn). Now we need to confirm that every rank is a list of ranks. ∅ only
lists ranks because it lists nothing. ω only lists ranks because it only lists numbers.
Suppose rank α only lists ranks. By Theorem 3.6 and Exercises 3.7 and 4.9, Sα only
lists α and the entries of α. So Sα only lists ranks.


4.25 ∅ is a part of ω because it is a part of every list. ω\∅ is ω. ω has no largest
entry because each number n is an entry of Sn. So ∅ is a non-cofinite part of ω. As
an inductive hypothesis, suppose number n is a non-cofinite part of ω. By Theorem
3.6 and Exercise 3.7, Sn is a part of ω. Suppose m is the largest entry of ω\Sn.
Then Sn lists Sm but does not list m, contrary to Theorem 3.6. We conclude that
Sn is non-cofinite.


4.27 Since A is non-cofinite, ω\A has entries. By Theorem 3.8, we can let k be the
smallest of those entries. k is the smallest number not listed by A. k is listed by Sk.
So Sk lists the smallest number that discriminates between A and Sk. That is, A < Sk.


4.29 By Exercise 3.16, m does not list m. m lists all smaller numbers (since “smaller
than” means “listed by”). So m is the first number not listed by m. If n lists m, then
n lists the smallest number that discriminates between m and n. On the other hand,
suppose n lists k but m does not. Suppose k ≠ m. By Theorem 3.9, k lists m and,
hence, by Theorem 3.6, n lists m.


4.31 Suppose A ≠ B. By Axiom 3.1, A and B do not have the same entries and,
hence, there are numbers that discriminate between them. By Theorem 3.8, there is
a smallest such number. The list that does not list that number will precede the one
that does.


4.33


Chapter 5
The Hierarchy of Sets


5.1 New Axioms 


We will now rejoin the rest of the mathematical community and say “set” rather than
“list”, “member” rather than “entry”, “empty set” rather than “blank list”, “subset”
rather than “part”, and so on. We are also going to start afresh with new axioms 
characterizing ranks and their relationship to sets. We will not, however, forsake our 
friends the lists. We will use our old list theory to show that our new set axioms can 
be interpreted as claims about lists. We will also use our new set theory to show that 
our old list axioms can be interpreted as claims about sets. This may seem an odd 
way to spend our time, first going to the trouble of introducing new material and then 
going to even more trouble to show that this material is not so new after all. What 
a letdown to roll out shiny new vocabulary and axioms and then reveal that the new 
stuff is just an offbeat way of talking about stale old stuff! 

Well, the situation is hardly as bad as all that. First, it is a commonplace that you can
get fresh insights by approaching something familiar from a new direction. But,
second, there is a more specific reason for our preoccupation with interpretability.
Sorting out relations of interpretability (what is interpretable as what) is a mathematical
enterprise that helps us understand how various bits and pieces fit together to form
the great edifice of mathematics. Interpretation is a mathematical tool for exploring
the architecture of mathematics. For example, if we show that all the settled results
of mathematical field F are interpretable as assertions derivable from theory T, we
can conclude that T provides a foundation for F. Among other things, this would
mean that all the ideas and assumptions deployed in F can be reconstructed within
T (just as, in the previous chapter, we reconstructed the principles of real number
analysis inside our list theory). So, even if an evil mathematics czar outlawed
everything that bears the outward stamp of F, the mathematically essential insights and
methods of F might be preserved in T. If we showed that all settled results in every
mathematical sub-field can be interpreted as assertions derivable from T, we could
conclude that T provides a foundation for all of mathematics. Philosophers who want
to understand mathematics need to understand these architectural analyses: both how
they are performed and what they reveal. So another round of interpretability results
seems warranted. First, though, we need our new axioms. Here are some definitions
to prepare the way.


Definition 5.1 One rank PRECEDES another if and only if the first is a member of 
the second. 


In symbols, using the Greek letter epsilon (‘∈’) to express the membership relation
and, as in the last two chapters, letting the Greek letter rho (‘ρ’) stand for the rank
function:


ρ(x) precedes ρ(y) if and only if ρ(x) ∈ ρ(y).


When we discuss the ordering of our ranks, we are talking about which ranks are 
members of which. This should not seem so strange after your experience with list- 
theoretic numbers that were ordered by the listing relation. When we discussed the 
ordering of our numbers, we were talking about which numbers were entries of 
which. 


Definition 5.2 A set y OUTRANKS a set x if and only if ρ(x) precedes ρ(y).


Theorem 3.19 is the inspiration for our first new axiom. We are going to drop the 
business about hereditary finiteness and translate the list talk into set talk. Further- 
more, we are going to turn our reasoning process upside down, treating our former 
theorem as a fundamental premise, an axiom, and deriving facts about sets from it. 
It will simplify our task if we describe a universe inhabited only by sets. Within the 
confines of our theory, claims about “every set” will be assertions about everything 
in our limited universe. 


Axiom 5.1 One set outranks another if and only if some member of the former 
outranks each member of the latter. 


We will adopt more axioms after some preliminary definitions. 


Definition 5.3 Some sets FORM a set if and only if the former sets are exactly the 
members of the latter set. 


If sets x and y were to form a set, it would be {x, y}, the set whose members are x 
and y. 


Definition 5.4 Some sets are, collectively, of BOUNDED RANK if and only if there 
is a set that outranks all of them. 


Axiom 5.2 Any sets of bounded rank will form a set. 


Axiom 5.3 There is a set with no members.


¹ This clever idea comes from Van Aken [9] (http://www.jstor.org/stable/2273911). For a discussion
of our axioms, see Pollard [6], pp. 161-164, and [7].


How many sets have no members? The next axiom tells us there is exactly one.


Axiom 5.4 Sets with the same members are the same.


We can now let ∅ be the one and only memberless set. Axioms 5.3 and 5.4 do not
say anything new or amazing. They are just Axioms 3.1 and 3.2 translated into the
language of sets.


Exercise 5.1 Show that ∅ outranks no set. Show that every set other than ∅
outranks ∅.


After yet another definition, we will use our axioms to prove a fundamental theorem. 


Definition 5.5 Some sets, the X’s, are collectively CLOSED UNDER SET FORMATION 
if and only if any set formed by some of the X’s is itself one of the X’s. 


If there is such a set as {x, y} and this set is distinct from both x and y, then x and 
y are not closed under set formation: x and y form {x, y}, but {x, y} is neither x nor 
y. You have to go beyond x and y to encounter the set formed by x and y. 


Theorem 5.1 If some sets are closed under set formation and ∅ is one of them, then
every set is one of them. 


Proof Suppose some sets are closed under set formation. Let’s call them “the V-sets”
or “the V’s”. All the other sets will be “the non-V’s”. Suppose ∅ is one of the V’s,
while x is one of the non-V’s. Note that every non-V outranks ∅ (by Exercise 5.1,
since no non-V is ∅). Furthermore, ∅ is
not a member of itself (since ∅ has no members at all). So there is at least one set
that (1) is outranked by every non-V and (2) is not a member of itself. By Axiom
5.2, all the sets satisfying these conditions will form a set. Let y be this set. Now
consider x: our representative of the non-V’s. If all of x’s members were V-sets,
then x would be a V-set too (since the V-sets are closed under set formation). So
we can let w be a member of x that is a non-V.² Since w outranks every member of
y, Axiom 5.1 guarantees that x outranks y. The only information we used about x
is that it is a non-V. So we can conclude that every non-V outranks y. If y is not a
member of itself, then it satisfies both conditions for membership in y and, hence, is
a member of itself. But if y is a member of itself, it must satisfy the second condition
for membership in y and, hence, is not a member of itself. So y is a member of itself
if and only if it isn’t. Since this is absurd, we must have gone wrong somewhere. Our
misstep was back at the start when we assumed that there are non-V’s.


The preceding theorem is known as the principle of EPSILON-INDUCTION. If you
can show that ∅ has an infection and that every set with infected members is infected,
then you can show that every set is infected.
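
To get a feel for Theorem 5.1, it may help to test it by brute force in a tiny setting. The following Python sketch is only an illustration under stated assumptions: sets are modelled as nested frozensets, the names `UNIVERSE` and `closed_under_set_formation` are hypothetical, and the four-set universe is a finite fragment rather than a model of our axioms, so "closed under set formation" is checked only for sets that happen to lie in that fragment.

```python
from itertools import combinations

EMPTY = frozenset()

def powerset(s):
    """Return every subset of s as a frozenset (including the empty one)."""
    items = list(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

# Toy universe (an assumption): the four sets ∅, {∅}, {{∅}}, {∅, {∅}}.
A = frozenset([EMPTY])
UNIVERSE = frozenset([EMPTY, A, frozenset([A]), frozenset([EMPTY, A])])

def closed_under_set_formation(collection):
    """True if every set in UNIVERSE formed by one or more members of
    collection is itself a member of collection."""
    return all(formed in collection
               for formed in powerset(collection)
               if formed and formed in UNIVERSE)

# Collections that contain ∅ and are closed under set formation.
closed_with_empty = [c for c in powerset(UNIVERSE)
                     if EMPTY in c and closed_under_set_formation(c)]

# Only the whole universe survives, as epsilon-induction predicts.
print(closed_with_empty == [UNIVERSE])
```

Within this fragment, any collection that contains ∅ is forced to contain {∅}, and then {{∅}} and {∅, {∅}}, which is the "infection" spreading from members to the sets they form.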


2 What if there were non-sets in our universe? Then we would have no guarantee that w is a set.
Since non-V’s are sets that are not V’s, we would have no reason to believe that w is a non-V. On
the other hand, if we allowed non-sets to be non-V’s, we could not have inferred that every non-V
outranks ∅.

Exercise 5.2 Use epsilon-induction to show that every set outranks its members.
Exercise 5.3 Use epsilon-induction to show that no set outranks itself.

By Exercise 5.2, x ∈ x only if ρ(x) ∈ ρ(x), which Exercise 5.3 rules out. So no set is a member of itself.

Theorem 5.2 If x outranks y and y outranks z, then x outranks z.


Proof Exercise 5.1 confirms that the theorem is true when z = ∅. (If x outranks
y, then x ≠ ∅ and, hence, x outranks ∅.) As an inductive hypothesis, suppose the
theorem comes out true whenever we let z be a member of set s. We want to show
that the theorem is true when z = s. Suppose, then, that x outranks y and y outranks
s. By Axiom 5.1, we can let x’ be a member of x that outranks every member of y,
while y’ is a member of y that outranks every member of s. Note that x’ outranks y’.
So, by our inductive hypothesis, x’ outranks every member of s. Hence, by Axiom
5.1, x outranks s. Since ∅ satisfies the theorem and the sets satisfying the theorem
are, collectively, closed under set formation, we conclude that every set satisfies the
theorem.
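
Axiom 5.1 and Theorem 5.2 can be spot-checked on a small sample of hereditarily finite sets. The sketch below is not a proof and rests on assumptions: sets are nested frozensets, and `rank` is a hypothetical stand-in for ρ that computes the usual von Neumann rank, with "x outranks y" read as rank(y) ∈ rank(x).

```python
from itertools import combinations

EMPTY = frozenset()

def powerset(s):
    """Return the set of all subsets of s."""
    items = list(s)
    return frozenset(frozenset(c) for r in range(len(items) + 1)
                     for c in combinations(items, r))

def rank(x):
    """Assumed rank function: union of rank(y) ∪ {rank(y)} over members y of x."""
    r = EMPTY
    for y in x:
        ry = rank(y)
        r = r | ry | frozenset([ry])
    return r

def outranks(x, y):
    """x outranks y if and only if rank(y) is a member of rank(x)."""
    return rank(y) in rank(x)

# Sample: the 16 sets reached by applying powerset to ∅ four times.
SAMPLE = EMPTY
for _ in range(4):
    SAMPLE = powerset(SAMPLE)

# Axiom 5.1: x outranks y iff some member of x outranks each member of y.
axiom_5_1 = all(
    outranks(x, y) == any(all(outranks(z, w) for w in y) for z in x)
    for x in SAMPLE for y in SAMPLE)

# Theorem 5.2: outranking is transitive.
theorem_5_2 = all(
    outranks(x, z)
    for x in SAMPLE for y in SAMPLE for z in SAMPLE
    if outranks(x, y) and outranks(y, z))

print(axiom_5_1, theorem_5_2)
```

The sample is deliberately small (16 sets), so the triple loop stays cheap; larger samples grow very quickly because each further power set is exponentially bigger.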


Definition 5.6 α is a RANK if and only if α = ρ(x) for some set x.
In what follows, we will assume that α, β, γ are ranks.

Theorem 5.3 IRREFLEXIVITY: α ∉ α.

Theorem 5.4 TRANSITIVITY: If α ∈ β and β ∈ γ, then α ∈ γ.


Theorem 5.5 WELL-FOUNDEDNESS: Given any ranks at all, one will be preceded 
by none of the others. 


Proof Pick some ranks and call them “the X-ranks”. All other sets will be “non-X’s”.
If ∅ is an X-rank, then an X-rank is preceded by no X-rank, as desired. Suppose ∅
is a non-X. If, in addition, a set is a non-X whenever its members are non-X’s, then
every set is a non-X (by Theorem 5.1), contrary to our assumption that there are X-ranks. So there must
be an X-rank whose members are all non-X’s. But this means there is an X-rank
preceded by no X-ranks.


Theorem 5.6 If x outranks every member of y, then y does not outrank x.


Proof Suppose x is a counter-example to the theorem. This means we can pick a y 
that satisfies the following condition: 


x outranks every member of y even though y outranks x. 


By Axiom 5.1, some member of y outranks every member of x. This means we can 
pick a member z of y that satisfies the following condition: 


z outranks every member of x even though x outranks z.


Such a z would be a counter-example to the theorem outranked by our original
counter-example x. Since this means no counter-example has minimal rank, it would
contradict Theorem 5.5 for there to be any counter-example at all.

Recall that, in the Mirimanoff construction ∅, S∅, SS∅, ..., each number is the
list of prior numbers. We now assume that ranks have a similar property.


Axiom 5.5 Every rank is a set of ranks. 


Exercise 5.4 Show that ∅ is a rank. (You might consider whether ρ(∅) has any
members.)


Theorem 5.7 CONNECTEDNESS: Given any two ranks, one will precede the other. 


Proof Say that a rank is DISCONNECTED if some other rank neither precedes nor is
preceded by it. Suppose there is a disconnected rank. Then we can use Theorem
5.5 to pick a disconnected rank α preceded by no disconnected rank. We say α is a
MINIMAL disconnected rank. Since α is disconnected, there is a rank distinct from α
that neither precedes nor is preceded by α. Use Theorem 5.5 to let γ be a minimal
such rank. If a rank β precedes γ, then, by the minimality of γ, β will either precede
or be preceded by α. (β cannot
be α because α would then precede γ.) If β is preceded by α, then, by Theorem 5.4,
α will precede γ. (α ∈ β ∈ γ ⇒ α ∈ γ.) So β will precede α and, more generally,
every rank that is a member of γ will be a member of α. According to Axiom 5.5,
this means that every member of γ is a member of α. Similar reasoning shows that
every member of α is a member of γ. So, by Axiom 5.4, α = γ, contrary to our
assumption that they are distinct. We conclude that there is no disconnected rank.


Theorem 5.8 ρ(α) = α.


Proof Suppose α is the first rank that violates the theorem. Let α = ρ(x). Then:


β ∈ α ⇒ ρ(β) = β ⇒ ρ(β) ∈ α ⇒ ρ(β) ∈ ρ(x).


So x outranks every member of α and, hence, by Theorem 5.6, α does not outrank
x. That is, ρ(x) ∉ ρ(α) and, hence, α ∉ ρ(α). Suppose ρ(α) ∈ α. Then, by the
minimality of α, ρ(ρ(α)) = ρ(α). Furthermore, by Exercise 5.2, ρ(ρ(α)) ∈ ρ(α).
So ρ(α) ∈ ρ(α), contrary to Exercise 5.3. We conclude that ρ(α) ∉ α. So, by
Theorem 5.7, ρ(α) = α. This means no counter-example to the theorem can have
minimal rank. So Theorem 5.5 guarantees there are no counter-examples at all.
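
Theorems 5.3, 5.4, 5.7, and 5.8 can be checked directly on the first few finite ranks. This is a minimal sketch under assumptions: ranks are modelled as von Neumann numerals built from frozensets, `successor` anticipates the operation α ∪ {α}, and `rank` is a hypothetical stand-in for ρ.

```python
EMPTY = frozenset()

def successor(a):
    """Return a ∪ {a}."""
    return a | frozenset([a])

def rank(x):
    """Assumed rank function: union of successor(rank(y)) over members y of x."""
    r = EMPTY
    for y in x:
        r = r | successor(rank(y))
    return r

# The first six finite ranks: ∅, {∅}, {∅, {∅}}, ...
RANKS = [EMPTY]
for _ in range(5):
    RANKS.append(successor(RANKS[-1]))

# Theorem 5.3: no rank is a member of itself.
irreflexive = all(a not in a for a in RANKS)

# Theorem 5.4: membership among ranks is transitive.
transitive = all(a in c
                 for a in RANKS for b in RANKS for c in RANKS
                 if a in b and b in c)

# Theorem 5.7: of any two distinct ranks, one precedes the other.
connected = all(a == b or a in b or b in a for a in RANKS for b in RANKS)

# Theorem 5.8: each rank is its own rank.
fixed_by_rank = all(rank(a) == a for a in RANKS)

print(irreflexive, transitive, connected, fixed_by_rank)
```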


Exercise 5.5 Show that ρ(ρ(x)) = ρ(x).
Exercise 5.6 Show that β outranks α if and only if α ∈ β.


Definition 5.7 A set x is a SUBSET of a set y (in symbols, x ⊆ y) if and only if
every member of x is a member of y. 


Exercise 5.7 Show that if α ≠ β and α ⊆ β, then α ∈ β.


Definition 5.8 A set x is TRANSITIVE if and only if every member of x is a subset
of x (that is, z ∈ y ∈ x only if z ∈ x).


Exercise 5.8 Show that every rank is transitive. 

Axiom 5.5 and Exercise 5.8 guarantee that every rank is a transitive set of ranks.
Now we want to prove the converse. 


Exercise 5.9 Show that if x is a set of ranks, then x ⊆ ρ(x).


Theorem 5.9 Every transitive set of ranks is a rank.


Proof Suppose x is a transitive set of ranks. We want to show that x has the same
members as ρ(x). Suppose α ∈ ρ(x). Then, by Theorem 5.8, ρ(α) ∈ ρ(x). That is,
x outranks α. So, by Theorem 5.6, α cannot outrank every member of x. Suppose β
is a member of x not outranked by α. Then, by Exercise 5.2, β ∉ α. So, by Theorem
5.7, either α ∈ β or α = β. In each case, α ∈ x. (In the first case, because x is
transitive.) We conclude that ρ(x) ⊆ x. Now apply Exercise 5.9 and Axiom 5.4.
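
A small enumeration makes Theorem 5.9 concrete for finite ranks. The sketch assumes, for illustration only, that ranks are von Neumann numerals built from frozensets; the names `NUMBERS`, `pool`, and `is_transitive` are hypothetical. It lists every subset of {0, 1, 2, 3} and keeps the transitive ones.

```python
from itertools import combinations

EMPTY = frozenset()

def successor(a):
    """Return a ∪ {a}."""
    return a | frozenset([a])

def is_transitive(x):
    """Definition 5.8: every member of x is a subset of x."""
    return all(y <= x for y in x)

# Finite ranks 0, 1, 2, 3, 4 as nested frozensets.
NUMBERS = [EMPTY]
for _ in range(4):
    NUMBERS.append(successor(NUMBERS[-1]))

# All sets whose members are drawn from the ranks 0, 1, 2, 3.
pool = NUMBERS[:4]
transitive_sets = [frozenset(c)
                   for r in range(len(pool) + 1)
                   for c in combinations(pool, r)
                   if is_transitive(frozenset(c))]

# The transitive sets of ranks found here are exactly the ranks 0 through 4.
print(set(transitive_sets) == set(NUMBERS))
```

Of the sixteen candidate sets, only the five initial segments are transitive, and each of those is itself one of the numerals.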


5.2 Two Models 


At the beginning of this chapter, I noted that we will explore interpretability relations 
between theories. We will consider interpretations or translations that carry us from 
one theory to another. We can treat this as an entirely syntactic exercise, assigning 
sentences to sentences without ever indicating what those sentences mean. I hope 
it will not be too confusing if we follow established usage and (as in Sect. 2.3)
talk about “interpretations” of another sort, interpretations that offer readings of
individual theories rather than mappings between theories. In such a reading of our
set theory, we would (1) specify our UNIVERSE OF DISCOURSE (we would indicate
what we are discussing when we say “every set”, “any sets”, “some set”, or “some
sets”); (2) indicate which objects in our universe are MEMBERS of which objects; and
(3) indicate which objects in our universe are the RANKS of which objects. Recall,
from Chap. 2, that a MODEL of our set theory would be an interpretation that makes
all our axioms true. At the moment, our only axioms are 5.1–5.5 (though we will
add more in a bit).

Here is a very simple interpretation. Let our universe of discourse consist of the 
lamp on my desk. When we say that all or some objects in the universe have a certain 
property, we will just mean that the lamp has that property. The lamp will count as 
a set, it will have no members, and its rank will be itself. 

If we eliminate defined vocabulary and translate things into a kind of logicianese, 
Axiom 5.1 reads as follows. 


Every set x and every set y is such that [ρ(x) ∈ ρ(y) if and only if some member z of y is
such that [every member w of x is such that [ρ(w) ∈ ρ(z)]]].


If that looks like gibberish, a more formal version may be helpful. 


∀x∀y((x is a set ∧ y is a set) → (ρ(x) ∈ ρ(y) ↔ ∃z ∈ y ∀w ∈ x ρ(w) ∈ ρ(z))).


According to the above interpretation, this says that my lamp is a member of my 
lamp if and only if some member of my lamp has a certain property that we need not 
analyze further. Since my lamp has no members, both sides of this biconditional are
false and, hence, the biconditional itself is true. So our interpretation makes Axiom 
5.1 true. 


Exercise 5.10 Show that our interpretation makes Axioms 5.2-5.5 true. 


You have confirmed that our interpretation is a model of our set theory: we can 
make all our axioms true in a universe that has only one thing in it. So our theory, 
though certainly consistent, is not yet of much mathematical use: it does not assert 
the existence of things that behave like the natural numbers or the real numbers 
or other interesting objects of mathematical investigation. In the next section, we 
shall consider how to strengthen our theory. First, however, we consider another 
interpretation of our theory. 

Once again, we let my lamp be the only object in our universe and we let my 
lamp’s rank be my lamp. This time, though, we give my lamp a member: itself. Let 
λ be my lamp. Then we have assumed the following: λ ∈ λ and ρ(λ) = λ. Since
ρ(λ) ∈ ρ(λ), λ is of bounded rank. So Axiom 5.2 says our universe must feature a
set {λ} whose only member is λ. Indeed it does, because λ = {λ}.


Exercise 5.11 Confirm that our new interpretation makes Axioms 5.1, 5.2, 5.4, and 
5.5 true, but makes Axiom 5.3 false. 


You have just shown that Axiom 5.3 is INDEPENDENT: it does not follow from the 
other axioms of our theory. If it did follow, it would not be possible for it to be false 
while the other axioms are true. But you just showed that this is possible. 
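
Because both lamp interpretations are finite, the truth values of Axioms 5.1 to 5.5 can also be computed mechanically. The following Python sketch is one possible encoding, not the book's: `check_axioms` is a hypothetical helper, membership and rank are supplied as dictionaries, and the plural phrase "any sets" is read as ranging over the non-empty collections of objects in the universe (an assumption about how to treat plural quantification in a finite setting).

```python
from itertools import combinations

def check_axioms(universe, members, rank):
    """Evaluate Axioms 5.1-5.5 in a finite interpretation.

    universe: list of objects, all of which count as sets
    members:  dict sending each object to the frozenset of its members
    rank:     dict sending each object to its rank (another object)
    """
    def outranks(x, y):
        # x outranks y iff the rank of y is a member of the rank of x.
        return rank[y] in members[rank[x]]

    # Assumption: "some sets" means a non-empty collection of objects.
    pluralities = [frozenset(c) for r in range(1, len(universe) + 1)
                   for c in combinations(universe, r)]
    ranks = {rank[x] for x in universe}

    ax1 = all(outranks(x, y) ==
              any(all(outranks(z, w) for w in members[y]) for z in members[x])
              for x in universe for y in universe)
    ax2 = all(any(members[s] == p for s in universe)
              for p in pluralities
              if any(all(outranks(b, x) for x in p) for b in universe))
    ax3 = any(not members[x] for x in universe)
    ax4 = all(x == y for x in universe for y in universe
              if members[x] == members[y])
    ax5 = all(m in ranks for r in ranks for m in members[r])
    return {"5.1": ax1, "5.2": ax2, "5.3": ax3, "5.4": ax4, "5.5": ax5}

LAMP = "lamp"
# First interpretation: the lamp has no members and is its own rank.
first = check_axioms([LAMP], {LAMP: frozenset()}, {LAMP: LAMP})
# Second interpretation: the lamp is its own only member and its own rank.
second = check_axioms([LAMP], {LAMP: frozenset([LAMP])}, {LAMP: LAMP})

print(first)
print(second)
```

The same function can be pointed at other small interpretations, though the number of collections to inspect doubles with each object added to the universe.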


5.3 How Many Ranks? 


Axiom 5.2 is only too happy to give us lots of sets as long as we hold up our end: 
we need to be good at showing that sets are of bounded rank. It will be easier for 
us to do that if there are lots of ranks. For example, if there is a non-empty rank, it
will outrank ∅. Since ∅ will then be of bounded rank, there will be a set whose only
member is ∅. That is, Axiom 5.2 will give us {∅}. If there are two non-empty ranks,
the larger of them will outrank both ∅ and {∅} and Axiom 5.2 will give us {∅, {∅}}
and {{∅}}. Here is an axiom that gives us infinitely many ranks.


Axiom 5.6 Every rank precedes some rank. 


I am not saying that some rank is preceded by every rank. (It would then have to 
precede itself.) The point is, on the contrary, that there is no maximum rank. Given 
any rank α, there is a rank preceded by α.


Exercise 5.12 Show that each set is outranked by some set. 


3 A biconditional is an “if and only if” statement. In classical logic, a biconditional ⌜φ ↔ ψ⌝ is
true when φ and ψ are both false (or both true).

Exercise 5.13 Show that any sets x, y will form a set {x, y}.
Exercise 5.14 Show that any members of a set will form a set.


Exercise 5.15 Show that if x is a set, then the members of x’s members form a
set ⋃x.
Exercise 5.16 Show that the subsets of any set x form a set Px.


You have just derived four of the axioms of Zermelo’s set theory Z (mentioned in 
Chaps. 3 and 4). Our Axiom 5.4 is another Z axiom. 


Theorem 5.10 The members of any sets x, y form a set x ∪ y.


Proof Note that x ∪ y = ⋃{x, y}.


Definition 5.9 α + 1 = α ∪ {α}.
Exercise 5.17 Show that α + 1 is the first rank preceded by α.


Note that the sets outranked by α are of bounded rank and, so, form a set.


Definition 5.10 V(∅) = ∅. If α outranks any sets, then the members of V(α) are
the sets outranked by α.


That is, x ∈ V(α) if and only if ρ(x) ∈ α. So x ∈ V(ρ(y)) if and only if y outranks
x.


Exercise 5.18 Show that V(α) is transitive.
Exercise 5.19 Show that V(α + 1) ⊆ PV(α).


Exercise 5.20 Show that PV(α) ⊆ V(α + 1). (It might help if you suppose α =
ρ(z).)


Theorem 5.11 V(α + 1) = PV(α).

Exercise 5.21 Show that if α ∈ β, then V(α) ⊆ V(β).
Exercise 5.22 Show that if α ∈ β, then V(α) ∈ V(β).
Theorem 5.12 ρ(x) is the first α such that x ⊆ V(α).


Proof By Exercises 5.2 and 5.5, y ∈ x only if ρ(y) ∈ ρ(ρ(x)). So ρ(x) outranks
every member of x and, hence, x ⊆ V(ρ(x)). Suppose x ⊆ V(α). Then α outranks
every member of x. So, by Theorem 5.6, x does not outrank α and, hence, by Theorem
5.7, ρ(x) ∈ α or ρ(x) = α, as desired.


Theorem 5.13 ρ(x) + 1 is the first α such that x ∈ V(α).


Proof Theorems 5.11 and 5.12 let us reason as follows:


x ⊆ V(ρ(x)) ⇒ x ∈ PV(ρ(x)) ⇒ x ∈ V(ρ(x) + 1).


According to Exercise 5.17, ρ(x) + 1 is the first rank preceded by ρ(x). So ρ(x) ∈ α
only if α ∉ ρ(x) + 1. But x ∈ V(α) only if ρ(x) ∈ α. So x ∈ V(α) only if
α ∉ ρ(x) + 1. That is, ρ(x) + 1 is minimal.


Exercise 5.18 and Theorem 5.13 yield the following. 


Theorem 5.14 Every set is a member of a transitive set. 
Exercise 5.23 Show that ρ(V(α)) = α.


Exercise 5.24 Show that V(α) ∈ V(β) only if α ∈ β.
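
The results of this section about V(α) can be illustrated for the first few finite ranks. The sketch below is an assumption-laden toy, not a construction from the text: sets are nested frozensets, `rank` is a hypothetical stand-in for ρ, and Definition 5.10 is applied only inside a 16-set `UNIVERSE`, which is large enough for the ranks 0 through 4.

```python
from itertools import combinations

EMPTY = frozenset()

def powerset(s):
    """Return the set of all subsets of s."""
    items = list(s)
    return frozenset(frozenset(c) for r in range(len(items) + 1)
                     for c in combinations(items, r))

def successor(a):
    """Definition 5.9: a + 1 = a ∪ {a}."""
    return a | frozenset([a])

def rank(x):
    """Assumed rank function: union of successor(rank(y)) over members y of x."""
    r = EMPTY
    for y in x:
        r = r | successor(rank(y))
    return r

# Finite ranks 0..4 and a sample universe of 16 hereditarily finite sets.
RANKS = [EMPTY]
for _ in range(4):
    RANKS.append(successor(RANKS[-1]))
UNIVERSE = powerset(powerset(powerset(powerset(EMPTY))))

def V(alpha):
    """Definition 5.10, restricted to UNIVERSE: the sets outranked by alpha."""
    return frozenset(x for x in UNIVERSE if rank(x) in alpha)

# Theorem 5.11: V(a + 1) = PV(a), checked for a = 0, 1, 2, 3.
theorem_5_11 = all(V(successor(a)) == powerset(V(a)) for a in RANKS[:4])

# Theorem 5.12: rank(x) is the first a with x a subset of V(a).
theorem_5_12 = all(
    min(i for i, a in enumerate(RANKS) if x <= V(a)) == RANKS.index(rank(x))
    for x in UNIVERSE)

# Theorem 5.13: rank(x) + 1 is the first a with x a member of V(a).
theorem_5_13 = all(
    min(i for i, a in enumerate(RANKS) if x in V(a))
    == RANKS.index(successor(rank(x)))
    for x in UNIVERSE)

print(theorem_5_11, theorem_5_12, theorem_5_13)
```

Going one stage further would require a universe of 65,536 sets, which is why the check stops at rank 4.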


5.4 Equiconsistency 


We turn now to the task of showing that our new set theory is interpretable in our old 
list theory—and vice versa. Here our interest is inter-theoretic interpretation. We will 
consider translations that carry us from one theory to another. Recall our definition 
of ‘Frege-precursor’ from Chap. 3: x is a Frege-precursor of y if and only if, given
any entry-closed lists, if y is one of those lists, then so is x. We now translate this 
definition from the language of list theory into the language of set theory. 


Definition 5.11 Some sets are, collectively, MEMBER-CLOSED if and only if each
member of one of them is itself one of them.


Definition 5.12 x is a FREGE-PRECURSOR of y if and only if, given any
member-closed sets, if y is one of those sets, then so is x.


We will now consider a new definition of ‘precursor’ and will show that the 
precursors of a set are exactly its Frege-precursors. 


Definition 5.13 x is a PRECURSOR of y if and only if x is a member of every transitive
set that has y as a member.


Suppose x is a precursor of y. We want to show that x is also a Frege-precursor 
of y. Pick some sets that are, collectively, member-closed. Call them “the M-sets”. 
Then every member of an M-set is an M-set. Suppose y is an M-set. We want to 
show that x is an M-set too. Consider the M-sets that do not outrank y. Since these 
sets are of bounded rank, they form a set z. 

Exercise 5.25 Show that the set z described above is transitive.


Since y does not outrank itself, it is a member of z. So, by Definition 5.13, x is a 
member of z and, hence, x is an M-set. We conclude that x will appear among some 
member-closed sets whenever y does. That is, x is a Frege-precursor of y. More 
generally, every precursor of a set is a Frege-precursor. We now want to prove the 
converse. 

Suppose x is a Frege-precursor of y. Suppose y is a member of the transitive set 
z. Then z’s members, collectively, have the property that every member of one of 
them is itself one of them. That is, z’s members are, collectively, member-closed. So,
by Definition 5.12, x is one of z’s members. We conclude that x will be a member 
of a transitive set whenever y is. That is, x is a precursor of y. 

Set precursors are just like list precursors: the precursors of a set y are what tireless 
immortal beings encounter when they start with y and persist in tracking down the 
members of everything they encounter. Precursors of y are y itself, the members of 
y, the members of y’s members, and so on. 


Exercise 5.26 Show that the precursors of a set x form a set Sx. (You might start
by confirming that each precursor of x is a member of V(ρ(x) + 1).)


Exercise 5.27 Show that if y is transitive, then Sy = y ∪ {y}. (You might first show
that y ∪ {y} is transitive.)


Note that, in particular, Sα = α + 1.


Theorem 5.15 Any sets of precursors of a set will form a set.


Proof Any set of precursors of a set x will be a subset of Sx and, hence, a member
of PSx. Now apply Exercise 5.14.
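
The "tireless immortal beings" description of precursors translates directly into a search procedure, and for small hereditarily finite sets it can be compared with Definition 5.13. The sketch below is illustrative only: sets are nested frozensets, the helper names are hypothetical, and "every transitive set" is approximated by the transitive sets found in a 16-set stage `V4`.

```python
from itertools import combinations

EMPTY = frozenset()

def powerset(s):
    """Return the set of all subsets of s."""
    items = list(s)
    return frozenset(frozenset(c) for r in range(len(items) + 1)
                     for c in combinations(items, r))

def successor(a):
    """Return a ∪ {a}."""
    return a | frozenset([a])

def is_transitive(x):
    """Every member of x is a subset of x."""
    return all(y <= x for y in x)

def precursors(y):
    """Sy by search: y, its members, the members of its members, and so on."""
    found = {y}
    frontier = [y]
    while frontier:
        current = frontier.pop()
        for member in current:
            if member not in found:
                found.add(member)
                frontier.append(member)
    return frozenset(found)

V3 = powerset(powerset(powerset(EMPTY)))   # four sets
V4 = powerset(V3)                          # sixteen sets

def precursors_by_definition(y):
    """Definition 5.13 within V4: members of every transitive set containing y."""
    candidates = [t for t in V4 if y in t and is_transitive(t)]
    return frozenset.intersection(*candidates)

# The search and the definition agree on every set in V3.
definitions_agree = all(precursors(y) == precursors_by_definition(y) for y in V3)

# Exercise 5.27: if y is transitive, then Sy = y ∪ {y}.
exercise_5_27 = all(precursors(y) == successor(y)
                    for y in V3 if is_transitive(y))

print(definitions_agree, exercise_5_27)
```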


When we translate Axioms 3.1–3.3 from the language of list theory into the
language of set theory we obtain Axiom 5.3, Axiom 5.4, and Theorem 5.15. If φ is
a sentence in the language of list theory that follows from Axioms 3.1–3.3, then the
translation of φ into the language of set theory will follow from Axiom 5.3, Axiom
5.4, and Theorem 5.15 and, hence, will follow from Axioms 5.1–5.6 (since Theorem
5.15 follows from Axioms 5.1–5.6). So every result we obtained from Axioms
3.1–3.3 becomes, when suitably translated, a theorem of our set theory and we can treat
it as such without bothering to give a whole new proof.

Suppose Axioms 3.1—3.3 are inconsistent. Then every sentence in the language 
of list theory follows from them. This would include the absurdity: 


Some list both is and is not an entry of itself.


Our translation of this sentence into the language of set theory is:


Some set both is and is not a member of itself.


Since this sentence is absurd, we conclude that Axioms 3.1–3.3 are inconsistent
only if Axioms 5.1–5.6 are. So any evidence for the consistency of Axioms
5.1–5.6 is evidence for the consistency of Axioms 3.1–3.3. We say Axioms
3.1–3.3 are consistent “relative to” Axioms 5.1–5.6. We also say that Axioms 3.1–3.3 are
INTERPRETABLE IN Axioms 5.1–5.6.

We have just seen how to interpret the list-theoretic claims of Axioms 3.1–3.3 as
set-theoretic claims that follow from Axioms 5.1–5.6. We now want to reverse course
and show how to interpret the set-theoretic claims of Axioms 5.1–5.6 as list-theoretic
claims that follow from Axioms 3.1–3.3. We read “set” as “hereditarily finite list”
and “member of” as “entry of”. We use Definition 3.13 to interpret the notion of rank.
Theorem 3.19, then, is our translation of Axiom 5.1. Exercise 3.43 is our translation
of Axiom 5.2. Axiom 5.3 presents no difficulties: the blank list is a hereditarily finite
list with no hereditarily finite entries. Exercise 3.32 is our list-theoretic version of
Axiom 5.4. Our list-theoretic interpretation of “rank” makes Axioms 5.5 and 5.6
true because every number is a list of numbers (Exercise 4.5) and every number is an
entry of a number. So we have shown how to make Axioms 5.1–5.6 come out true in
the universe of hereditarily finite lists. If φ is a sentence in the language of set theory
that follows from Axioms 5.1–5.6, then the translation of φ into the language of list
theory will follow from Axioms 3.1–3.3. So every result we obtained from Axioms
5.1–5.6 becomes, when suitably translated, a theorem of our list theory and we can
treat it as such without bothering to give a whole new proof.

If it follows from Axioms 5.1–5.6 that every set both is and is not a member
of itself, then it will follow from Axioms 3.1–3.3 that every hereditarily finite list
both is and is not an entry of itself. Axioms 5.1–5.6 are inconsistent only if Axioms
3.1–3.3 are. So any evidence for the consistency of Axioms 3.1–3.3 is evidence for
the consistency of Axioms 5.1–5.6. Having proved RELATIVE CONSISTENCY in both
directions, we say that Axioms 3.1–3.3 and Axioms 5.1–5.6 are EQUICONSISTENT.
We have shown: Axioms 3.1–3.3 are consistent if and only if Axioms 5.1–5.6 are.


5.5 Even More Ranks 


Our first interpretability result from the previous section allows us to exploit the 
number theoretic results of Chap. 3. In particular, we can assume that the numbers 


∅, S∅, SS∅, ...


that is,


∅, {∅}, {∅, {∅}}, ...


are well-defined and obey an induction principle. Given induction, Exercises 5.4,
5.17, and 5.27 imply that each number is a rank. What about the converse? Is every
rank a number? Suppose some rank is not and use Theorem 5.5 to let β be the first
such rank. This means β is a set of numbers. Since ∅ is a number, but β is not, β
is non-empty. Exercises 5.4 and 5.7 imply that ∅ is a member of every non-empty
rank. So ∅ ∈ β. Suppose the number n is a member of β, but Sn is not. Since β is
transitive, Sn cannot be a member of any member of β. Given any two numbers, one
will be a member of the other. So every member of β is a member of Sn. On the other
hand, the members of Sn are n and the members of n and these are all members of
β. But this means β = Sn, contrary to our assumption that β is not a number. We
conclude that Sn is a member of β whenever n is and, more generally, that every
number is a member of β. A rank that is not a number will have every number as a
member.

We are going to translate this result into the language of list theory so we can apply 
the second interpretability result from the previous section. To be sure we know what 
we are translating, let us express our result more formally. On the assumption that 
some rank x is not a number, we have found that x will, first of all, have the following
property.


∃y(y ∈ x ∧ ∀z z ∉ y)


Our translation of this will say that x is a hereditarily finite list with a hereditarily 
finite entry y that has no hereditarily finite entries. Since super-numbers list the 
entries of their entries, entries of hereditarily finite lists are themselves hereditarily 
finite. So y will have no entries at all and our translation will say that x lists the blank 
list. Returning to the world of sets, the set x (the rank that is not a number) will also 
have the following property. 


∀w(w ∈ x → w ∪ {w} ∈ x)


In the language of set theory, the term ‘w ∪ {w}’ refers to the set whose members
are w and the members of w. If w were a hereditarily finite list, Exercises 3.28 and
3.30 would assure us that there is a hereditarily finite list whose entries are w and
the entries of w. When we do our translation, we can let ‘w ∪ {w}’ refer to that list.
So our translation says there is a hereditarily finite list x that lists ∅ and lists w ∪ {w}
whenever it lists w. We can use induction and Theorem 3.7 to show that such an x
will list every number.

If Axioms 5.1–5.6 imply that some rank is not a number, then, by the second
interpretability result from the previous section, Axioms 3.1–3.3 imply that some
hereditarily finite list lists every number. Since this contradicts Theorem 3.17, it
would mean that Axioms 3.1–3.3 are inconsistent. If you believe these axioms are
consistent, you can think of our argument as a reductio ad absurdum. Here is the 
sequence of entailments, starting with the assumption to be reduced to absurdity. 


Axioms 5.1–5.6 prove that some rank is not a number


⇓


Axioms 5.1–5.6 prove that the numbers form a set


⇓


Axioms 3.1–3.3 prove that the numbers form a hereditarily finite list


⇓


Axioms 3.1–3.3 are inconsistent.


We conclude: if Axioms 3.1–3.3 are consistent (or, equivalently, if Axioms 5.1–5.6
are consistent), then Axioms 5.1–5.6 do not imply that there are any ranks other than
the numbers ∅, S∅, SS∅, .... If we want a rank that is not a number, we will have to
adopt a new axiom. We are going to adopt an axiom that gives a rank ω that has all
the numbers as its members.


Definition 5.14 α IMMEDIATELY PRECEDES γ if and only if α ∈ γ but for no β do
we have α ∈ β ∈ γ. A rank that precedes γ is said to be a PREDECESSOR of γ. A
rank that immediately precedes γ is said to be γ’s IMMEDIATE PREDECESSOR.


Exercise 5.28 Confirm that a rank can have at most one immediate predecessor. 
(Note, too, that we already know of a rank with no immediate predecessor.) 


Exercise 5.29 Confirm that every number other than ∅ has an immediate
predecessor. 


Axiom 5.7 There is a non-empty rank with no immediate predecessor. 


Since ∅ is the only number with no immediate predecessor, Axiom 5.7 implies
that some rank is not a number and, so, is independent of the other set axioms (if our 
first three list axioms are consistent). 


Definition 5.15 Let ω be the first rank that satisfies Axiom 5.7.


Theorem 5.16 Every number is a member of ω.


Proof Exercise 5.29 implies that ω is not a number. Now just apply the argument I
gave a few paragraphs ago about ranks that are not numbers. 


Theorem 5.17 Every member of ω is a number.


Proof Suppose, for contradiction, that some member of ω is not a number, and let β
be the first such member. Then every member of
β is a number. Furthermore, since β is a rank that is not a number, every number
is a member of β. Given any predecessor n of β, we will have: n ∈ Sn ∈ β. So β
is a non-empty rank with no immediate predecessor, contradicting the minimality
of ω.

The members of ω are the numbers ∅, S∅, SS∅, .... According to Definition 5.10,
the members of V(ω) are the sets outranked by ω. So we have:


x ∈ V(ω) ⇔ ρ(x) ∈ ω ⇔ ρ(x) is a number.


Just as in the world of lists, something is a number if and only if it is an entry or, as
we now say, a member of a number. So:


ρ(x) is a number ⇔ ρ(x) ∈ n for some number n.


That is, ρ(x) is a number if and only if x is outranked by a number. Since the members
of V(n) are the sets outranked by n, we conclude:


x ∈ V(ω) ⇔ x ∈ V(n) for some number n.


By Definition 5.10, Theorem 5.11, and Exercises 5.8 and 5.27,


V(∅) = ∅
V(Sn) = PV(n).


So, just as in Chap. 3, we can show that the sets V(n) are the super-numbers
∅, P∅, PP∅, .... If we say that a hereditarily finite set is any member of a
super-number, the members of V(ω) are the hereditarily finite sets. This confirms the
following theorem.


Theorem 5.18 The hereditarily finite sets form a set. 


For brevity’s sake, we say that Axioms 3.1, 3.2, 3.3, and 4.1 are “the list axioms”,
while Axioms 5.1–5.7 are “the set axioms”. We have just confirmed that the
translation of Axiom 4.1 is derivable from the set axioms. We already know that the
translations of Axioms 3.1–3.3 are derivable. So, if φ is a sentence in the language
of list theory that follows from the list axioms, the translation of φ into the language
of set theory will follow from the set axioms. The list axioms, then, are inconsistent
only if the set axioms are. Any evidence for the consistency of the set axioms is
evidence for the consistency of the list axioms.

Results from Chap. 4 let us establish the converse. To translate statements about
sets into statements about lists, we read “set” as “Zermelian list” and “member of” as
“entry of”. Theorem 4.2 translates Axiom 5.1. Exercises 4.20–4.23 translate Axioms
5.2–5.5. As for Axioms 5.6 and 5.7, note two things. First, every rank is an entry of
a rank. Second, ω is a rank with entries and each entry of ω is listed by an entry of
ω. We conclude: if φ is a sentence in the language of set theory that follows from
the set axioms, then the translation of φ into the language of list theory will follow
from the list axioms. The set axioms are inconsistent only if the list axioms are.
Any evidence for the consistency of the list axioms is evidence for the consistency
of the set axioms. Indeed, given our earlier result, we see that the list axioms and
the set axioms are equiconsistent: the list axioms are consistent if and only if the set
axioms are.


5.6 V(ω + ω) and Beyond


We have seen that our set axioms all come out true in the universe of Zermelian lists. 
All the mathematics you are likely to encounter in an undergraduate mathematics 
course can be reconstructed within this universe. If, however, you want an even more 
robust set theory, a set theory that makes demands on its models not satisfied by the 
Zermelian universe, you should consider adding axioms that give you more ranks. 
As we discussed in Sect. 5.3, ranks are the fuel that drives the production of sets.

What is missing from the universe of Zermelian lists? ω + ω, if it existed, would
list all the entries of all the higher ranks


ω + 0, ω + 1, ω + 2, ...


just as ω lists all the entries of all the numbers


0, 1, 2, ....


ω + ω would, in fact, be the list of all Zermelian ranks. As a transitive set of ranks,
ω + ω would, by Theorem 5.9, itself be a rank. So, if ω + ω were Zermelian, it
would list itself. But this cannot happen, because we know Theorem 5.3 is true in the
universe of Zermelian lists.⁴ So ω + ω is missing from the Zermelian universe and it
does not follow from our set axioms that there are any ranks other than the numbers
∅, S∅, SS∅, ... and the higher ranks ω, Sω, SSω, ....⁵ Summarizing the argument:


• PREMISE: Our set axioms are all true in the universe of Zermelian lists.

• PREMISE: Anything implied by our set axioms will be true wherever those axioms
themselves are true.

• PREMISE: In the universe of Zermelian lists, there is no list of all Zermelian ranks.

• CONCLUSION: Our set axioms do not imply that there is a set of such ranks.


4 Similar reasoning yields the Burali-Forti paradox in some versions of set theory. See Burali-Forti 
[1]; English translation in van Heijenoort [10], pp. 104-111. 


5 OK, there is another alternative. Maybe our set axioms are inconsistent and the whole notion of 
a universe of Zermelian lists is fundamentally incoherent. A good reason to believe this wrong is 
that some profound minds have thought deeply about the Zermelian universe and have detected no 
incoherence. A prominent mathematician who thought he did detect incoherence was HERMANN 
WEYL (1885-1955), who believed classical set theory was “permeated by the poison of
contradiction”. (See Weyl [11], p. 23; English translation in Weyl [12], p. 32.) For an argument that Weyl
should be taken seriously in general, but not in this case, see Pollard [8]. 


This means it would strengthen our set theory to suppose that, say, there is more than
one non-empty rank with no immediate predecessor. This would give us ω + ω. It
would also give us a set V(ω + ω) in which our set axioms all come out true: what
had served as the universe of all sets would become a set.


Exercise 5.30 Consider the set theory consisting of Axioms 5.1—5.6 and a version 
of Axiom 5.7 asserting that there are at least two non-empty ranks with no immediate 
predecessors. Describe a rank whose existence does not follow from these axioms. 


The preceding exercise might leave us wishing for a less piecemeal approach. 
No sooner do we escape the Zermelian universe than we discover a rank missing 
from our new universe. Could we manage a more dramatic gesture that gives us 
lots of ranks all at once and doesn’t leave any gaps that are too obvious? Here is an 
idea. If it existed, the rank ω + ω would be well-ordered by the membership relation.
Recall, from Sect. 3.6, that a well-ordering is a transitive, irreflexive, connected,
well-founded relation. The “less than” relation on the natural numbers is an example. It
is transitive

m < n < p only if m < p,


irreflexive

m ≮ m,


and connected

m = n or m < n or n < m.


As for well-foundedness, if you pick any natural numbers, one of them will be less 
than all the others. (No matter what numbers you pick, there will be a “bottom” one.) 
The membership relation has these same properties inside of ω and would have them
inside of ω + ω. Well-ordered by membership, ω + ω would look like two copies of
ω (two copies of the natural numbers) placed one after another. The numbers


0, S0, SS0, ...

would form the first copy of ω, while the higher ranks

ω, Sω, SSω, ...


would form the second. Now, whether or not ω + ω exists, we can re-arrange the
natural numbers themselves so that they too look like two copies of ω. For example,
we can let the even numbers 0, 2, 4, ... be the first copy of ω, to be followed by the
odd numbers 1, 3, 5, ... forming the second copy. The natural numbers would then
be arranged in the same way as the Zermelian ranks (the numbers n plus the higher
ranks ω + n). This suggests the following association of numbers with ranks.


f(2n) = n
f(2n + 1) = ω + n.


The function f associates numbers with ranks as follows. 


0    2    4   ...    1     3     5    ...
↓    ↓    ↓          ↓     ↓     ↓
0    1    2   ...   ω+0   ω+1   ω+2   ...


Or, if you prefer:


0    1    2    3    4    5   ...
↓    ↓    ↓    ↓    ↓    ↓
0   ω+0   1   ω+1   2   ω+2  ...


Depending on how you look at it, the function f shows us how to make the natural 
numbers look like the Zermelian ranks or it shows us how to make the Zermelian 
ranks look like the natural numbers. 
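
The association can be sketched in a few lines of Python. This is only an illustration under an encoding that is not part of the text: the number n is represented by the pair (0, n) and the higher rank ω + n by the pair (1, n), so that Python's built-in ordering of tuples stands in for the membership ordering of the Zermelian ranks.

```python
def f(k):
    """Send 2n to the number n and 2n + 1 to the rank omega + n.

    Hypothetical encoding: (0, n) stands for n, (1, n) for omega + n.
    """
    n, parity = divmod(k, 2)
    return (parity, n)


# Sorting 0..9 by the order of their images puts the evens before the odds.
rearranged = sorted(range(10), key=f)
print(rearranged)  # prints [0, 2, 4, 6, 8, 1, 3, 5, 7, 9]
```

Read one way, f relabels the natural numbers as ranks; read the other way, sorting by f rearranges the natural numbers into two copies of the usual order.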

How will this help us to get lots of new ranks? Here is a proposition that would 
allow us to use functions like f to generate ranks. 


Proposition 5.1 Whenever there is a function that takes us from the members of a 
set to some ranks, those ranks themselves form a set. 


Our function f takes us from the members of the set ω to the Zermelian ranks.
So Proposition 5.1 would give us a set whose members are the Zermelian ranks.
Proposition 5.1 would give us ω + ω. Here is another way of thinking about it: if
the members of a set, ordered by some relation, exhibit the same structure as some
ranks, ordered by membership, then those ranks form a set. Since we can make the
members of ω look like two copies of ω and the membership relation makes the
Zermelian ranks look like two copies of ω, those ranks form a set. Alternatively, if
you can make some ranks look like the members of a set, ordered by some relation,
then, again, those ranks form a set. Since we can rearrange the Zermelian ranks so
that they look like the members of the set ω, the Zermelian ranks form a set.


Exercise 5.31 If ω·2 (“omega times two”) looks like two copies of ω and ω·n looks
like n copies of ω, what does ω² (“omega squared”) look like? Use Proposition 5.1
to show that ω² exists.


Here is one way to motivate Proposition 5.1. If we abstract from everything but 
order, then the even natural numbers in their standard order look just like all the 
natural numbers in their standard order. 


0 2 4 6 8 ...
0 1 2 3 4 ...

We might say that these two structures are instances or tokens of the same type, the
same ORDER TYPE. Each type of well-ordering is known as an ORDINAL. Now if we
start talking freely about ordinals, someone might reasonably ask us whether these
ordinals are supposed to be objects in the universe of sets and, if so, what objects.
Structures that look like the members of our rank ω, structures that look like the
natural numbers in their standard order, are traditionally said to be of order type ω:
their ordinal is ω. Perhaps we should identify the ordinal ω with our rank ω. Perhaps
we should, quite generally, identify ordinals with ranks. In the absence of Proposition
5.1, this would lead immediately to problems. We want every well-ordering to be a
well-ordering of some type. We want there to be an ordinal for every well-ordering.
The Zermelian universe V(ω + ω) features well-ordered structures of type ω + ω
but no rank ω + ω. So our set axioms will leave us unable to prove that every
well-ordering has an ordinal if we identify ordinals with ranks. Proposition 5.1 is a way
of responding to this problem. Given Proposition 5.1, ω + ω exists because it is the
order type of a well-ordered set: that is, because we can make the members of the
set ω look like the Zermelian ranks.⁶

Chapters 3-5 have given you three big doses of set theory. You ought to know, 
however, that you have only just begun to learn about a majestic, audacious mathe- 
matical construction that ignited the imagination of some of the most profound minds 
of the last century. You should explore further on your own.⁷ Our next two chapters,
however, are going to stray from the set theoretic mainstream. We begin with Frege’s 
uncanny demonstration that arithmetic is a stew consisting almost entirely of ideas 
and methods drawn from pure logic. 


5.7 Solutions of Odd-Numbered Exercises 


5.1 ∅ outranks a set x if and only if some member of ∅ outranks each member of
x. But ∅ has no members. A set x will outrank ∅ if and only if some member of x
outranks each member of ∅. If x ≠ ∅, then, by Axiom 5.4, x has members. Any
member of x will outrank each member of ∅ (again, because ∅ has no members).


5.3 By Exercise 5.1, ∅ does not outrank itself (because it outranks nothing). Suppose
no member of x outranks itself. Then no member of x outranks each member of x.
So, by Axiom 5.1, x does not outrank itself.


5.5 By Definition 5.6, ρ(x) is a rank. Theorem 5.8 applies to every rank α. So
ρ(ρ(x)) = ρ(x).


6 Proposition 5.1 is a version of the Replacement Axiom of ABRAHAM FRAENKEL (1891-1965)
and THORALF SKOLEM (1887-1963). See Fraenkel [2], p. 231, and van Heijenoort [10], p. 297. For 
an especially clear demonstration that Replacement supplies every well-ordered set with an ordinal, 
see Kunen [5], p. 17. For an overview of the history, see Kanamori [4]. 


7 You might start with Kunen [5] for the mathematics and Hallett [3] for the history.


5.7 If β ∈ α ⊆ β, then β ∈ β, contrary to Theorem 5.3. So β ∉ α and, since α ≠ β,
Theorem 5.7 guarantees that α ∈ β.


5.9 Suppose x is a set of ranks and α ∈ x. By Exercise 5.2, ρ(α) ∈ ρ(x) and, hence,
by Theorem 5.8, α ∈ ρ(x). We conclude that x ⊆ ρ(x).


5.11 Axiom 5.1 says that my lamp outranks my lamp if and only if some member 
of my lamp outranks each member of my lamp. Both halves of the biconditional are 
true: my lamp does outrank my lamp and, furthermore, my lamp’s only member out- 
ranks my lamp’s only member (since my lamp’s only member is my lamp). Axiom 
5.2 says: if my lamp is of bounded rank, then there is a set whose only member is 
my lamp. Well, my lamp is of bounded rank and is itself a set whose only member 
is my lamp. Axiom 5.3 is false because the only set in the universe (my lamp) has a 
member (itself). Axiom 5.4 is true because there is only one set in the universe and, 
in the absence of distinct sets, there cannot be distinct sets with the same members. 
Axiom 5.5 is true because our one rank has itself as its only member. 


5.13 Pick sets x, y. Theorem 5.7 lets us assume that ρ(x) either precedes ρ(y) or
is ρ(y). Axiom 5.6 lets us pick a rank α preceded by ρ(y). By Theorem 5.4, ρ(x)
precedes α. So the sets x, y are collectively of bounded rank and, hence, by Axiom
5.2, form a set.


5.15 Suppose z ∈ y ∈ x. By Exercise 5.2 and Theorem 5.2, x outranks z. We conclude:
the members of x’s members are of bounded rank and, so, we can apply Axiom
5.2.


5.17 Suppose β ∈ (α ∪ {α}). Then either β ∈ α or β = α. By Theorems 5.3 and 5.4,
α ∉ β. We conclude: if α precedes β, then β does not precede α + 1. But is α + 1 a
rank? According to Theorem 5.9, the answer will be “yes” if α ∪ {α} is a transitive
set of ranks. It is a set of ranks because α is both a rank and (Axiom 5.5) a set of
ranks. As for transitivity, suppose γ ∈ β ∈ (α ∪ {α}). Case 1: β ∈ α. By Theorem
5.4, γ ∈ α. Case 2: β = α. Again, γ ∈ α. So, in each case, γ ∈ (α ∪ {α}).


5.19 Suppose x ∈ V(α + 1). Then ρ(x) ∈ (α + 1) and, hence, either ρ(x) ∈ α or
ρ(x) = α. We want to show that x ⊆ V(α) (since it then follows that x ∈ 𝒫V(α)).
Suppose y ∈ x. By Exercise 5.2, ρ(y) ∈ ρ(x) and, hence, by Theorem 5.4, ρ(y) ∈ α.
That is, y ∈ V(α).


5.21 Theorem 5.4 and Definition 5.10 let us reason as follows.

x ∈ V(α) ⇒ ρ(x) ∈ α ∈ β ⇒ ρ(x) ∈ β ⇒ x ∈ V(β).


5.23 According to Theorem 5.12, ρ(V(α)) is the first rank β such that V(α) ⊆ V(β).
So, since V(α) ⊆ V(α), α ∉ ρ(V(α)). On the other hand, suppose ρ(V(α)) ∈ α.
Then, by Exercise 5.22, V(ρ(V(α))) ∈ V(α). But, by Theorem 5.12, V(α) ⊆
V(ρ(V(α))). So V(ρ(V(α))) ∈ V(ρ(V(α))), contrary to Exercises 5.2 and 5.3.
Now apply Theorem 5.7.


5.25 Suppose v ∈ w ∈ z. Since w is an M-set, so is v. Since w outranks v but does
not outrank y, v does not outrank y. So v ∈ z.


5.27 Suppose y is transitive. We want to show that y ∪ {y} is transitive. Suppose
w ∈ x ∈ (y ∪ {y}). Then either x ∈ y or x = y. In each case, w ∈ y and, hence,
w ∈ (y ∪ {y}). Since y ∪ {y} is transitive and y ∈ (y ∪ {y}), every precursor of y is
a member of y ∪ {y}. That is, Sy ⊆ (y ∪ {y}). On the other hand, (y ∪ {y}) ⊆ Sy
since y and all its members are precursors of y. Now apply Axiom 5.4.


5.29 Suppose n ∈ k ∈ Sn. By Exercise 5.27, either n ∈ k = n or n ∈ k ∈ n,
contrary to Theorems 5.3 and 5.4. So the immediate predecessor of Sn is n. Every
number is either ∅ or a successor. So every non-zero number has an immediate
predecessor.


5.31 ω² looks like ω copies of ω. Here is one way to rearrange the members of ω\{0}
to look like that. Let the odd numbers be the first copy of ω. Multiply each odd
number by 2 to get the second copy. Multiply those numbers by 2 to get the third
copy. And so on. We end up with the following structure.


1  3  5  7  9 11 ...
2  6 10 14 18 22 ...
4 12 20 28 36 44 ...
8 24 40 56 72 88 ...


The nth copy of ω is the result of multiplying each odd number by 2ⁿ⁻¹. The
factorization (2m − 1) · 2ⁿ⁻¹ tells us to look in column m on row n. For example,


36 = 9 · 2² = (2·5 − 1) · 2³⁻¹.
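
The same factorization can be computed mechanically. The following is a minimal sketch; the function name `position` is ours, not the text's. It strips factors of 2 to find the row and reads the column off the remaining odd factor.

```python
def position(k):
    """Return (row, column) of the positive integer k in the table.

    Writes k as (2m - 1) * 2**(n - 1) and returns (n, m).
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    n = 1
    while k % 2 == 0:   # each factor of 2 moves k down one row
        k //= 2
        n += 1
    m = (k + 1) // 2    # k is now odd, so k == 2m - 1
    return n, m


print(position(36))  # prints (3, 5)
```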


So 36 appears in column 5 on row 3. Since every positive integer is the product of an
odd number and a power of 2, we are not leaving out any members of ω\{0}. Here is
another way to think about it. Write down the odd numbers and the powers of 2 first.


1  3  5  7  9  11 ...
2
4
8


Treat the odd numbers 3, 5, 7, 9, 11, ... as column labels and the powers 2, 4, 8, ...
as row labels. Fill in the table by putting j · k in column j on row k.


Chapter 6
Frege Arithmetic


6.1 The Language of FA 


If you ask to see a formalization of arithmetic, a mathematician who knows about 
such things will probably point you toward Peano Arithmetic. We will now consider 
an alternative formalization of number theory derived from the work of Gottlob 
Frege. We will also consider an unsuccessful Fregean approach to the foundations 
of set theory. As with our treatment of PA in Chap. 2, we will not present a formal 
logic for our new version of number theory. Our proofs will be informal. Be assured, 
however, that Frege himself provided a fully formalized version of the underlying 
logic.¹
The language of FREGE ARITHMETIC (FA) has the following vocabulary. 


1. Two CONNECTIVES: ‘¬’ (“not”), ‘→’ (“if ... then”).
2. A QUANTIFIER: ‘∀’ (“for all”).
3. The IDENTITY symbol: ‘=’.
4. One FUNCTION symbol: ‘#’ (“number of”).
5. Infinitely many OBJECT VARIABLES: ‘w’, ‘x’, ‘y’, ‘z’, ‘w₁’, ‘x₁’, ‘y₁’, ‘z₁’, ‘w₂’,
‘x₂’, ‘y₂’, ‘z₂’, ...
6. Infinitely many PROPERTY VARIABLES: ‘F’, ‘G’, ‘H’, ‘F₁’, ‘G₁’, ‘H₁’, ...
7. Infinitely many RELATION VARIABLES: ‘P’, ‘Q’, ‘R’, ‘P₁’, ‘Q₁’, ‘R₁’, ...
8. Two PARENTHESES: ‘(’, ‘)’.


1 Informality will not be our only departure from Frege. So be warned: this chapter does not pretend
to offer a close reading of Frege’s own exposition. It draws freely on later reconstructions such as
the one in George and Velleman [7]. For Frege’s own version, see [4] and [5]. For some helpful
commentary, see Boolos and Heck [1].


S. Pollard, A Mathematical Prelude to the Philosophy of Mathematics,
DOI: 10.1007/978-3-319-05816-0_6,
© Springer International Publishing Switzerland 2014


The OBJECT TERMS of FA are the object variables and any expression ⌜#Γ⌝
where Γ is a property variable. The idea is that #F is the number of things that have
property F. If F were the property of being a prefecture of Japan, then #F would
be the number of Japanese prefectures: that is, #F would be 47. As the expression
“object term” suggests, numbers are classified as objects. # is a function that assigns
an object (a number) to each property.
We recursively define the FORMULAS of FA as follows. 


9. If α and β are object terms, then ⌜α = β⌝ is a formula.
10. If α is an object term and Γ is a property variable, then ⌜Γα⌝ is a formula.
11. If α and β are object terms and Σ is a relation variable, then ⌜Σαβ⌝ is a formula.
12. If φ and ψ are formulas, then so are ⌜¬φ⌝ and ⌜(φ → ψ)⌝.
13. If μ is a variable and φ is a formula, then ⌜∀μ φ⌝ is a formula.


Property variables range over properties. ‘Fx’ says that object x has property F.
Relation variables range over binary (two-place) relations. ‘Rxy’ says that object x
stands in relation R to object y.

We define FREE and BOUND occurrences of variables just as we did in Chap. 2. 
A SENTENCE is a formula with no free occurrences of variables. We now stipulate: 


(ψ ∧ χ) ⇔ ¬(ψ → ¬χ)
(ψ ↔ χ) ⇔ ((ψ → χ) ∧ (χ → ψ))
(ψ ∨ χ) ⇔ (¬ψ → χ)
∃μ φ ⇔ ¬∀μ ¬φ.

‘∧’ represents CONJUNCTION (“and”), ‘↔’ represents the BICONDITIONAL (“if and
only if”), while ‘∨’ represents DISJUNCTION (“or”). ‘∃’ is the EXISTENTIAL
QUANTIFIER (“there is”).


Exercise 6.1 Translate the following sentences of FA into English. 
∀F∀G(∀x(Fx ↔ Gx) → #F = #G).
∀F∀G(#F = #G → ∀x(Fx ↔ Gx)).
∀F∃G(¬∀x(Fx ↔ Gx) ∧ #F = #G).
∃F∀G(#F = #G → ¬∃x Gx).
∀R(∃x∀y Rxy → ∀y∃x Rxy).


We use recursion to introduce NUMERICAL QUANTIFIERS ⌜∃ₙ⌝ (“there are exactly
n”) as follows.


Definition 6.1


∃₀x φx ⇔ ¬∃x φx.
∃ₙ₊₁x φx ⇔ ∃x(φx ∧ ∃ₙy(φy ∧ x ≠ y)).

‘∃₀x Fx’ says no object has property F: ¬∃x Fx. In English: “It is false that there
is an object with property F”. ‘∃₁x Fx’ says exactly one object has property F:


∃x(Fx ∧ ∃₀y(Fy ∧ x ≠ y)).


In English: “There is an object x that has property F and no object other than x has
property F”. ‘∃₂x Fx’ says exactly two objects have property F:


∃x(Fx ∧ ∃₁y(Fy ∧ x ≠ y)).


In English: “There is an object x that has property F and exactly one object other
than x has property F”. And so on.
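
The recursion in Definition 6.1 can be mirrored directly in code. The sketch below makes an assumption the text does not: it evaluates the quantifiers over a small finite domain and represents a property as a Python predicate. The function name `exactly` is ours.

```python
def exactly(n, domain, phi):
    """Evaluate 'there are exactly n x such that phi(x)' over a finite
    domain, following the recursion of Definition 6.1."""
    if n == 0:
        # Base clause: no object in the domain satisfies phi.
        return not any(phi(x) for x in domain)
    # Recursive clause: some x satisfies phi, and exactly n - 1
    # objects other than x satisfy phi.
    return any(
        phi(x) and exactly(n - 1, domain, lambda y, x=x: phi(y) and y != x)
        for x in domain
    )


domain = range(10)
is_even = lambda x: x % 2 == 0
print(exactly(5, domain, is_even), exactly(4, domain, is_even))  # prints True False
```

The recursion is faithful to the definition rather than efficient: it revisits the domain at every level, which is harmless for small examples.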
The following definition will be particularly useful. 


Definition 6.2
∀F∀G(F ≈ G ↔ ∃R(∀x(Fx → ∃₁y(Rxy ∧ Gy)) ∧ ∀y(Gy → ∃₁x(Rxy ∧ Fx)))).


Suppose F ≈ G and let R be a relation of the sort required by the definition.
Say that the objects with properties F and G are, respectively, the F-objects and
the G-objects. Then R assigns to each F-object exactly one G-object. Furthermore,
each G-object is assigned by R to exactly one F-object. So R PAIRS F-objects with
G-objects. We also say that R is a PAIRING.
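
For finite collections, the two conditions in Definition 6.2 can be checked by brute force. The sketch below assumes, purely for illustration, that properties are modelled as finite Python sets and relations as sets of ordered pairs; `is_pairing` is a name of our own choosing.

```python
def is_pairing(R, F, G):
    """Check the two conditions of Definition 6.2 for a finite relation R,
    given as a set of (x, y) pairs, and finite sets F and G."""
    each_f_has_one_g = all(
        sum(1 for y in G if (x, y) in R) == 1 for x in F
    )
    each_g_has_one_f = all(
        sum(1 for x in F if (x, y) in R) == 1 for y in G
    )
    return each_f_has_one_g and each_g_has_one_f


F = {"a", "b", "c"}
G = {1, 2, 3}
R = {("a", 1), ("b", 2), ("c", 3)}
print(is_pairing(R, F, G))               # prints True
print(is_pairing(R - {("c", 3)}, F, G))  # prints False: "c" is left unpaired
```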

Suppose, for example, that F is the property of being a New England state capital, 
while G is the property of being a New England state. Let R be a relation that holds 
between a city and a state when the former is the capital of the latter. Then R assigns 
to each New England state capital exactly one New England state, while each New 
England state is assigned by R to exactly one New England state capital. Indeed, R 
forms the following pairing. 


Hartford    --is the capital of-->  Connecticut
Augusta     --is the capital of-->  Maine
Boston      --is the capital of-->  Massachusetts
Concord     --is the capital of-->  New Hampshire
Providence  --is the capital of-->  Rhode Island
Montpelier  --is the capital of-->  Vermont


From the existence of such a pairing we can infer that the F-objects and the
G-objects are EQUINUMEROUS, that is, the same in number.


6.2 The Axioms of FA


We will now make the unsurprising assumption that the F-objects and the G-objects
are the same in number if and only if the number of F-objects is the same as the
number of G-objects. This assumption is known as HUME’S PRINCIPLE.²


∀F∀G(F ≈ G ↔ #F = #G).


We also adopt infinitely many COMPREHENSION axioms for properties. These are
the sentences of the form

⌜∀⃗∃Γ∀α(Γα ↔ φ)⌝


where α is an object variable, Γ is a property variable, φ is a formula of FA with
no free occurrences of Γ, and ∀⃗ is a string of universal quantifiers that bind all the
free occurrences of variables in the remainder of the sentence. Here is an example
of a comprehension axiom:


∀G∀y∃F∀x(Fx ↔ (Gx ∧ x ≠ y)).


This says: given any property G and object y, there is a property F that applies, 
exactly, to those G-objects that are not y. If G were the property of being a U.S. 
president and y were the current president, then you could think of F as the property 
of being a past president. 


Exercise 6.2 Can you think of a highly undesirable comprehension axiom we would
obtain if we allowed Γ to occur free in φ?


Exercise 6.3 The language of FA features a symbol ‘=’ expressing the relation 
of identity. Show that this symbol is eliminable by offering a definition of identity. 
Naturally, ‘=’ should not occur in your definition. You might recall from a logic class 
that two fundamental properties of identity are: (1) everything is identical to itself and
(2) identical things have the same properties. Verify that your definition is adequate
by showing that it entails these two properties. (Students of modern philosophy might
find it helpful to review Leibniz.)


Here is another one of our comprehension axioms:
∃F∀x(Fx ↔ x ≠ x).


It says there is a property that applies to nothing. Let 𝔑 be such a property.


2 In Sect. 63 of [3] (English translation in [6]), Frege quotes the following remark by DAVID HUME
(1711-1776): “When two numbers are so combined as that one has always an unit answering to
every unit of the other, we pronounce them equal”.


Exercise 6.4 Use comprehension to show that there is a property 𝔒 that applies
to exactly one object. (By the way, how many objects are we acquainted with at this
point?) Use Hume’s principle to show that #𝔑 is not the only number in the universe.


Exercise 6.5 Use comprehension to show that there is a property 𝔇 that applies to
exactly two objects. Use Hume’s principle to show that #𝔑 and #𝔒 are not the only
numbers in the universe. How many numbers are there?


We conclude this section by adopting infinitely many COMPREHENSION axioms 
for relations. These are the sentences of the form 


⌜∀⃗∃Σ∀α∀β(Σαβ ↔ φ)⌝


where α, β are object variables, Σ is a relation variable, φ is a formula of FA with
no free occurrences of Σ, and ∀⃗ is a string of universal quantifiers that bind all the
free occurrences of variables in the remainder of the sentence. Here is an example
of a comprehension axiom of this sort:


∃R∀x∀y(Rxy ↔ ∃F(Fx ∧ Fy)).


This sentence says there is a relation R that holds between any objects x and y that 
share at least one property. 


Exercise 6.6 Show that if R behaves in the way just described, then ∀x∀y Rxy.


Exercise 6.7 Show: ∀F(#F = #𝔑 ↔ ∃₀x Fx).


Exercise 6.8 Show: ∀F(#F = #𝔒 ↔ ∃₁x Fx).


6.3 Some Number Theory 


Let 0 = #𝔑 where 𝔑 is, as in the previous section, a property that applies to nothing:
∀x(𝔑x ↔ x ≠ x).


This should sound reasonable: 0 is the number of objects that are not identical to 
themselves; 0 is the number of objects that have a property nothing could possibly 
have. This definition certainly seems less contrived than our stipulation in Chap. 3 
that 0 is the blank list. To express the whole business in the language of lists: it seems 
less contrived to say that 0 is the number of entries of the blank list than to stipulate 
that 0 is the blank list. 

Let 1 = #𝔘 where 𝔘 is a property whose existence is guaranteed by the following
comprehension axiom:

∃F∀x(Fx ↔ x = 0).

That is, 𝔘 is a property that applies only to 0:
∀x(𝔘x ↔ x = 0).


Since 0 is the only object that has property 𝔘, the number of 𝔘-objects is, indeed,
one. 1 is the number of objects identical to 0. Again, this seems less contrived than
our earlier stipulation that 1 is the list whose only entry is the blank list.

We know, of course, that 1 is the first number after 0: that 1 immediately succeeds 0.
We now want to give a formal definition of immediate succession using the language 
of FA. Since our definition will be a bit complicated, we are going to sneak up on 
it one little step at a time. We begin with a theorem of FA that may appear entirely 
pointless: 

0 = 0 ∧ ∀y(y ≠ y ↔ (y = 0 ∧ y ≠ 0)).


From this we infer the following:
𝔘0 ∧ ∀y(𝔑y ↔ (𝔘y ∧ y ≠ 0)).


So:
∃x(𝔘x ∧ ∀y(𝔑y ↔ (𝔘y ∧ y ≠ x))).


Adding a bit more information about 𝔑 and 𝔘:
0 = #𝔑 ∧ 1 = #𝔘 ∧ ∃x(𝔘x ∧ ∀y(𝔑y ↔ (𝔘y ∧ y ≠ x))).
We conclude:
∃F∃G(0 = #F ∧ 1 = #G ∧ ∃x(Gx ∧ ∀y(Fy ↔ (Gy ∧ y ≠ x)))).


Since we derived this sentence from a theorem of FA, we know it too is a theorem
of FA. Our theorem says there are properties F and G with the following
characteristics: 0 is the number of F-objects, 1 is the number of G-objects, and, for some
G-object x, the F-objects are exactly the G-objects distinct from x. That is, we can
obtain the F-objects by deleting one G-object or, in other words, the number of
G-objects SUCCEEDS the number of F-objects. (I am going to save ink by leaving off the
‘immediately’ in ‘immediately succeeds’.) In symbols: σ#F#G. (‘σ’ is the Greek
letter sigma.) Since #F = 0 and #G = 1, we have just confirmed that 1 succeeds
0: σ01. We have also provided an analysis of this relation of succession. We now
enshrine this analysis in a general definition of succession.


Definition 6.3


∀w∀z(σwz ↔ ∃F∃G(w = #F ∧ z = #G ∧ ∃x(Gx ∧ ∀y(Fy ↔ (Gy ∧ y ≠ x))))).

z succeeds w if and only if w is the number of F-objects (for some property F), z is
the number of G-objects (for some property G), and we can obtain the F-objects by
deleting one of the G-objects.
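
A finite toy model may help fix ideas. The sketch below rests on assumptions that go beyond the text: properties are modelled as subsets of a three-element domain, and #F is modelled by the size of the subset (in FA itself, numbers are objects in the domain, which this toy model ignores). The helper names are ours.

```python
from itertools import combinations


def subsets(domain):
    """All subsets of a finite domain, as frozensets."""
    items = list(domain)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def succeeds(w, z, domain):
    """Definition 6.3 read over a finite domain: some F and G have w and z
    members respectively, and F is G with one member x deleted."""
    props = subsets(domain)
    return any(
        len(F) == w and len(G) == z and any(F == G - {x} for x in G)
        for F in props for G in props
    )


domain = {"a", "b", "c"}
print(succeeds(0, 1, domain), succeeds(1, 2, domain), succeeds(1, 3, domain))
# prints True True False
```

In this model, succession holds exactly between a size and the next size up, as long as the domain is large enough to supply the witnessing subsets.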


Exercise 6.9 Show: ¬∃w σw0.

Theorem 6.1 ∀w∀z₁∀z₂((σwz₁ ∧ σwz₂) → z₁ = z₂).


Proof Let F₁, G₁ be properties certifying that σwz₁, while F₂, G₂ are properties
certifying that σwz₂. That is,


w = #F₁ ∧ z₁ = #G₁ ∧ ∃x₁(G₁x₁ ∧ ∀y(F₁y ↔ (G₁y ∧ y ≠ x₁)))
and
w = #F₂ ∧ z₂ = #G₂ ∧ ∃x₂(G₂x₂ ∧ ∀y(F₂y ↔ (G₂y ∧ y ≠ x₂))).


We need to show that #G₁ = #G₂ and this, in turn, requires us to show that G₁ ≈ G₂.
Note, first, that #F₁ = #F₂ since both are identical to w. So Hume’s principle allows
us to pick a relation R that pairs F₁-objects with F₂-objects. Comprehension gives
us another pairing P with the following useful feature:


∀x∀y(Pxy ↔ (F₁x ∧ F₂y ∧ Rxy)).


That is, P ignores everything but F₁-objects and F₂-objects, pairing the former with
the latter. Now pick a G₁-object x₁ and a G₂-object x₂ that behave as indicated
above. That is, first, the F₁-objects are exactly the G₁-objects distinct from x₁ and,
second, the F₂-objects are exactly the G₂-objects distinct from x₂. We will use these
materials to identify a relation that pairs G₁-objects and G₂-objects. Comprehension
provides a relation Q that behaves as follows:


∀x∀y(Qxy ↔ (Pxy ∨ (x = x₁ ∧ y = x₂))).


We want to show that Q is the desired pairing. Suppose x is a G₁-object. If x = x₁,
then Qxx₂. Suppose x ≠ x₁. Then x is an F₁-object and, hence, we can pick an
F₂-object y such that Pxy. So Qxy. Since every F₂-object is a G₂-object, y is a
G₂-object. We conclude that Q assigns a G₂-object to every G₁-object. We still need
to show that this assignment is unique. Suppose:


G₁x, Qxz, Qxz′.


Case 1: x = x₁. Then x is not an F₁-object and, hence, it is not the case that Pxy
for any object y. So Qxy only if y = x₂. This means that z and z′ are the same
G₂-object, namely, x₂. Case 2: x ≠ x₁. Then Pxz and Pxz′. So z and z′ are the same
F₂-object and, hence, are the same G₂-object. We conclude: given any G₁-object x,
there is a unique G₂-object y such that Qxy. Similar reasoning shows: given any
G₂-object y, there is a unique G₁-object x such that Qxy. We conclude: G₁ ≈ G₂.
So, by Hume’s principle, #G₁ = #G₂.
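
The construction of Q in this proof can be tried out on finite data. The sketch below assumes, for illustration only, that properties are finite sets and relations are sets of ordered pairs; `extend_pairing` is a hypothetical helper name.

```python
def extend_pairing(R, F1, F2, x1, x2):
    """Build the relation Q from the proof of Theorem 6.1.

    P keeps only the part of R that links F1-objects to F2-objects;
    Q adds the single extra pair (x1, x2).
    """
    P = {(x, y) for (x, y) in R if x in F1 and y in F2}
    return P | {(x1, x2)}


F1, F2 = {"a", "b"}, {1, 2}
R = {("a", 2), ("b", 1), ("z", 9)}   # ("z", 9) is noise that P discards
Q = extend_pairing(R, F1, F2, "c", 3)
print(Q == {("a", 2), ("b", 1), ("c", 3)})  # prints True
```

Here G₁ is {"a", "b", "c"} and G₂ is {1, 2, 3}: Q pairs them because P already pairs what is left once "c" and 3 are set aside.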


Theorem 6.2 ∀w₁∀w₂∀z((σw₁z ∧ σw₂z) → w₁ = w₂).


Proof Let F₁, G₁ be properties certifying that σw₁z, while F₂, G₂ are properties
certifying that σw₂z. That is,


w₁ = #F₁ ∧ z = #G₁ ∧ ∃x₁(G₁x₁ ∧ ∀y(F₁y ↔ (G₁y ∧ y ≠ x₁)))
and
w₂ = #F₂ ∧ z = #G₂ ∧ ∃x₂(G₂x₂ ∧ ∀y(F₂y ↔ (G₂y ∧ y ≠ x₂))).


We need to show that F₁ ≈ F₂. Since #G₁ = #G₂, Hume’s principle allows us to
pick a relation R that pairs G₁-objects with G₂-objects. We also pick a G₁-object
x₁ and a G₂-object x₂ as indicated above. The G₁-objects are the F₁-objects plus
x₁, while the G₂-objects are the F₂-objects plus x₂. If R pairs x₁ with x₂, then R
already pairs F₁-objects with F₂-objects (since they are what is left when we have
disposed of x₁ and x₂). Suppose, on the other hand, that Rx₁x₂′ and Rx₁′x₂ where
x₂′ ≠ x₂. Then x₁′ ≠ x₁. Furthermore, x₁′ is an F₁-object, while x₂′ is an F₂-object.
We want to make an adjustment: we want x₁′ and x₂′ to be paired with one another.
Comprehension provides a relation Q that behaves as follows:


∀x∀y(Qxy ↔ ((x = x₁′ ∧ y = x₂′) ∨ (x ≠ x₁′ ∧ y ≠ x₂′ ∧ Rxy))).


We want to show that Q is the desired pairing. Suppose x is an F₁-object. If x = x₁′,
then Qxx₂′. Suppose x ≠ x₁′. Pick a y such that Rxy. Then y ≠ x₂ and, hence,
y is an F₂-object. If y = x₂′, then x = x₁, which is impossible since x₁ is not an
F₁-object. So y ≠ x₂′ and, hence, Qxy. We conclude that Q assigns an F₂-object to
every F₁-object. We still need to show that this assignment is unique. Suppose:


F₁x, Qxz, Qxz′.


Case 1: x = x₁′. Then Qxy only if y = x₂′. So z and z′ are the same F₂-object, namely,
x₂′. Case 2: x ≠ x₁′. Then Rxz and Rxz′ and, hence, z = z′. We conclude: given any
F₁-object x, there is a unique F₂-object y such that Qxy. Similar reasoning shows:
given any F₂-object y, there is a unique F₁-object x such that Qxy. We conclude:
F₁ ≈ F₂. So, by Hume’s principle, #F₁ = #F₂.
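
The adjustment made in this proof can also be carried out on finite data. As before, the sketch assumes that relations are finite sets of ordered pairs; `adjust_pairing` is a hypothetical helper name, and it covers only the case in which R does not already pair x₁ with x₂.

```python
def adjust_pairing(R, x1, x2):
    """Build Q from the proof of Theorem 6.2, assuming R pairs the
    G1-objects with the G2-objects but does not pair x1 with x2."""
    x2_prime = next(y for (x, y) in R if x == x1)   # R pairs x1 with x2'
    x1_prime = next(x for (x, y) in R if y == x2)   # R pairs x1' with x2
    # Keep the pairs of R that avoid x1' and x2', then pair those two.
    Q = {(x, y) for (x, y) in R if x != x1_prime and y != x2_prime}
    Q.add((x1_prime, x2_prime))
    return Q


# G1 = {"a", "b", "c"} with x1 = "c"; G2 = {1, 2, 3} with x2 = 3.
R = {("a", 3), ("b", 2), ("c", 1)}
Q = adjust_pairing(R, "c", 3)
print(Q == {("a", 1), ("b", 2)})  # prints True
```

The result pairs the F₁-objects "a", "b" with the F₂-objects 1, 2, leaving "c" and 3 out of the picture.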


Definition 6.4 A property F is σ-CLOSED if and only if:
∀x∀y((Fx ∧ σxy) → Fy).


A property is σ-closed if and only if all the objects that have it pass it on
to all their successors.


Definition 6.5 x ≤ y if and only if y has every σ-closed property that x has.


x ≤ y if and only if y is one of the objects tireless immortal beings will encounter
if they start with x and track down all the successors of everything they encounter.
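
On a small finite structure, Definition 6.5 can be checked by running through every property. The sketch below assumes, beyond anything in the text, a five-element domain, a successor relation given as a set of pairs, and properties modelled as subsets; the function names are ours.

```python
from itertools import combinations


def is_closed(F, succ):
    """F is closed under succ if every successor of an F-object is an F-object."""
    return all(y in F for (x, y) in succ if x in F)


def weakly_precedes(x, y, domain, succ):
    """Definition 6.5 over a finite domain: y has every closed property x has."""
    items = list(domain)
    props = (set(c) for r in range(len(items) + 1)
             for c in combinations(items, r))
    return all(y in F for F in props if is_closed(F, succ) and x in F)


domain = range(5)
succ = {(n, n + 1) for n in range(4)}   # 0 -> 1 -> 2 -> 3 -> 4
print(weakly_precedes(1, 3, domain, succ), weakly_precedes(3, 1, domain, succ))
# prints True False
```

The second answer is False because {3, 4} is a closed property that 3 has and 1 lacks; this matches the picture of starting at 3 and never reaching 1 by following successors.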


Theorem 6.3 ≤ is reflexive and transitive.


Proof Reflexivity: x ≤ x because x has every σ-closed property that x has.
Transitivity: if y has every σ-closed property that x has and z has every σ-closed property
that y has, then z has every σ-closed property that x has.


Theorem 6.4 ∀x∀y(σxy → x ≤ y).

Proof If σxy, then x gives each of its σ-closed properties to y.

Theorem 6.5 ∀z(z ≤ 0 → z = 0).


Proof Suppose z ≤ 0. Then 0 has every σ-closed property that z has. Exercise 6.9
implies that the property “not identical to 0” is σ-closed. That is,


∀x∀y((x ≠ 0 ∧ σxy) → y ≠ 0)


since
∀x∀y(σxy → y ≠ 0).


So z ≠ 0 only if 0 ≠ 0. Since we are rewarded with an absurdity if z ≠ 0, we
conclude that z = 0.


Definition 6.6 ∀x∀y(x < y ↔ (x ≤ y ∧ x ≠ y)).

Theorem 6.6 ∀w∀x∀y((σxy ∧ w < y) → w ≤ x).


Proof Suppose F is a σ-closed property that applies to w. We want to show that it
also applies to x. Use comprehension to pick a property G such that


∀z(Gz ↔ (Fz ∧ z ≠ y)).


Note that w ≠ y since w < y. So Gw since we just supposed that Fw. y has every
σ-closed property that w has. So G can be σ-closed only if y ≠ y. Clearly, G is not
σ-closed. So we can pick z₁, z₂ such that σz₁z₂ even though G applies to z₁ and
not z₂. Since Gz₁, we have: Fz₁ and z₁ ≠ y. Since not Gz₂, we have: Fz₂ only if
z₂ = y. Since F is σ-closed and applies to z₁, it applies to z₂. So z₂ = y. Hence, by
Theorem 6.2, z₁ = x, since σz₁y and σxy. So Fx, as desired. We conclude that x
has every σ-closed property that w has.


Exercise 6.10 Show: ∀x∀y∀z((σxy ∧ x < z) → y ≤ z). Hint: suppose F is a
σ-closed property that applies to y; consider the following comprehension axiom:


∃G∀w(Gw ↔ (Fw ∨ w = x)).


Definition 6.7 x is a NATURAL NUMBER if and only if 0 ≤ x.


Natural numbers are objects that are doomed whenever 0 catches a disease that
is transmitted to successors. Less fancifully, the natural numbers are the objects
possessing every σ-closed property that applies to 0.


Theorem 6.7 0 is a natural number. 


Exercise 6.11 Show: every object that succeeds a natural number is a natural 
number. 


Theorem 6.8 σ-closed properties that apply to 0 apply to every natural number.


Say that something is a NUMBER if and only if it is the number of things that have
some property. That is, z is a number if and only if z = #G for some property G.
Then every natural number is a number. To confirm this, first recall that 0 = #𝔑. So
0 is a number. Now look back at Definition 6.3 to confirm that σwz only if z = #G
for some property G. So the property of being a number is σ-closed. Now just apply
Theorem 6.8.


Exercise 6.12 Use induction (Theorem 6.8) to show that no natural number suc- 
ceeds itself. 


Theorem 6.9 If x and y are natural numbers, then σxy only if x < y.

Proof Just apply Theorem 6.4 and Exercise 6.12.


Exercise 6.13 Use induction to show that every natural number other than 0
succeeds a natural number.


Exercise 6.14 Use induction and Exercise 6.10 to show: if x, y are natural numbers,
then either x ≤ y or y ≤ x.


Theorem 6.10 If x and y are natural numbers and y ≤ x, then ¬σxy.


Proof Exercise 6.9 guarantees that the theorem holds when y = 0. Suppose σyz.
Our argument is inductive: we want to show that z satisfies the theorem if y does.
Suppose z ≤ x. We want to show that ¬σxz. Exercise 6.13 says we need only
consider two cases. First: x = 0. Then, by Theorem 6.5, z = 0. So, by Exercise
6.9 (or 6.12), ¬σxz. Second: x succeeds a natural number. Suppose σwx. If z = x,
then Exercise 6.12 guarantees that ¬σxz. Suppose z ≠ x. Then z < x and, hence,
by Theorem 6.6, z ≤ w. By Theorem 6.4, y ≤ z. So, by Theorem 6.3, y ≤ w. Our
inductive hypothesis is that y satisfies the theorem. So ¬σwy and, hence, x ≠ y. By
Theorem 6.2, ¬σxz.


Exercise 6.15 Use induction to show that if x and y are natural numbers, then
x ≤ y and y ≤ x only if x = y. (You might find Theorems 6.3, 6.4, 6.5, 6.6, and 6.10
useful.)


Exercise 6.16 Show that if x, y are natural numbers, then σxy only if
∀w(w ≤ x ↔ w < y).

(Again, you might find Theorems 6.6 and 6.10 useful.)


Theorem 6.11 Suppose x₁, x₂ are natural numbers and σx₁x₂. If

∀y(F₁y ↔ y ≤ x₁),

∀y(F₂y ↔ y ≤ x₂),
then σ#F₁#F₂.
Proof According to Exercise 6.16,

∀y(y ≤ x₁ ↔ (y ≤ x₂ ∧ y ≠ x₂)).
So, since x₂ ≤ x₂,
∃x(x ≤ x₂ ∧ ∀y(y ≤ x₁ ↔ (y ≤ x₂ ∧ y ≠ x))).


So:
∃x(F₂x ∧ ∀y(F₁y ↔ (F₂y ∧ y ≠ x))).


So:
∃F∃G(#F₁ = #F ∧ #F₂ = #G ∧ ∃x(Gx ∧ ∀y(Fy ↔ (Gy ∧ y ≠ x)))).


That is, σ#F₁#F₂.


Exercise 6.17 Show that ∀y(Fy ↔ y < 0) only if #F = 0.


Exercise 6.18 Suppose x₁, x₂ are natural numbers and σx₁x₂. Further, suppose
#F₁ = x₁ whenever F₁ is a property satisfying the following condition:


∀y(F₁y ↔ y < x₁).


Show that if
∀y(F₂y ↔ y < x₂),


then #F₂ = x₂. (You might show that σx₁#F₂ and then apply Theorem 6.1.)


Theorem 6.12 If x is a natural number and ∀y(Fy ↔ y < x), then #F = x.


Proof Note that the previous two exercises provide an inductive argument for this
theorem.

We now know that each natural number n is the number of objects less than n.
Suppose x < m where m is a natural number. By Theorem 6.2 and Exercise 6.13, the
property of not being a natural number is σ-closed (since, otherwise, some natural
number would have a predecessor that both is and is not a natural number). So, since
m has every σ-closed property that x has, there is no way for x not to be a natural
number. We conclude that each natural number n is the number of natural numbers
less than n.


Exercise 6.19 Show that every natural number has a successor. (According to Exercise
6.13, you only need to consider two cases: 0 and a natural number x₂ that succeeds
a natural number x₁. The first case is already done: we showed that σ01. As
for the second case, you might take a look at Theorem 6.11.)


Suppose N is a property that applies to all and only the natural numbers, while N⁺
is a property that applies to all and only the positive (i.e., non-zero) natural numbers.
Then the successor relation σ is a pairing between the N-objects and the N⁺-objects.
So N ≈ N⁺ and, hence, by Hume’s principle, #N = #N⁺. On the other hand,


∃x(Nx ∧ ∀y(N⁺y ↔ (Ny ∧ y ≠ x))).


So #N succeeds #N⁺ and, hence, #N succeeds itself. Exercise 6.12 lets us conclude
that #N is not a natural number. Every natural number is a number, but not every 
number is a natural number. In particular, the number of natural numbers is not a 
natural number. The number of natural numbers cannot be measured by any finite 
number: not a million, not a billion, not 


1910". 


OK, that is no surprise. There are infinitely many natural numbers. What is surprising, 
even uncanny, is the way this result emerges from a lot of logical manipulations and 
a few applications of one innocent-looking observation about the conditions under
which numbers are the same: the number of F-objects is the same as the number of
G-objects if and only if the F-objects and the G-objects are equal in number. Something
that makes Hume’s principle look all the more innocent, and the emergence of
infinitely many natural numbers all the more uncanny, is the definability of “equal in
number” in our background logic. Granted, our logic does not provide a definition of
“number of” that lets us eliminate the operator ‘#’. ‘#’ is not eliminable in the way
‘σ’ is. So our development of arithmetic requires more than logic. But it does not
require much more. When we get inside some of the proofs given above, it is striking 
how much of the work consists of logical manipulations of logical vocabulary— 
manipulations that make us feel like we have wandered into a logic class. FA treats 
number theory as applied logic. Just as applied mathematics can look an awful lot 
like mathematics, applied logic can look an awful lot like logic. At the very least, 
FA gives number theory a strongly logical flavor. 




6.4 FA Interprets PA 


We still need to figure out how much number theory FA provides. We can give a 
precise answer to this question if we introduce a beefier version of Peano Arithmetic 
(PA). Let the vocabulary of SECOND-ORDER PEANO ARITHMETIC consist of the
vocabulary of PA and the whole vocabulary of FA except for the function symbol 
‘#’. Second-order PA includes all the comprehension axioms in this vocabulary. The 
remaining axioms of second-order PA are the same as those of PA except that we 
drop the infinitely many induction axioms and replace them with the single sentence: 


∀F((F0 ∧ ∀x(Fx → FSx)) → ∀xFx).


To decipher the expression ‘FSx’ you need to remember that ‘S’ is a function symbol
in the language of PA. ‘Sx’ is a term referring to the successor of x. The formula
‘FSx’ says that the successor of x has property F. Our induction principle says that
if 0 has property F and the successor of x has property F whenever x does, then
everything in the universe of PA (that is, every natural number) has property F.

FA provides at least as much number theory as second-order PA because, as we 
will confirm, second-order PA is interpretable in FA. Though we will not prove 
it, the converse is also true.³ So FA provides exactly as much number theory as
second-order PA because the two theories are mutually interpretable. 

When we offer FA translations of PA axioms, we will feel free to use items of 
defined vocabulary such as ‘0’ and ‘<’. This is legitimate because our definitions 
allow us to eliminate all occurrences of the defined expressions. That means we are 
always able to recover the official version. For example, if we offer φ(0) as our
translation of a PA axiom and someone objects that ‘0’ is not part of the official
vocabulary of FA, we can offer the alternative translation


∃F(∀x(Fx ↔ x ≠ x) ∧ φ(#F)).


A faithful translation will preserve correct inferences: if ψ follows from φ₁, ..., φₙ,
then the translation of ψ will follow from the translations of φ₁, ..., φₙ. A translation
that behaves otherwise would not even teach us about relative consistency. (Recall
§5.4.) Here is an example of what we expect from our translations. Since (φ(0) ∧
ψ(0)) follows from φ(0) and ψ(0), our translation of the former sentence should
follow from our translations of the latter sentences. So


∃F(∀x(Fx ↔ x ≠ x) ∧ (φ(#F) ∧ ψ(#F)))
should follow from


∃F(∀x(Fx ↔ x ≠ x) ∧ φ(#F))
and
∃G(∀x(Gx ↔ x ≠ x) ∧ ψ(#G)).


Well, it does follow. If φ(#F) and ψ(#G) and, furthermore, F and G are properties
that apply to nothing, then, by Exercise 6.7, #F = #G and, hence, (φ(#F) ∧ ψ(#F)).
This is hardly a proof that our translations will always preserve correct inferences.
It is just an indication of what such a proof would involve. (You might try to work
out what a real proof would look like.)


3 See Burgess [2], pp. 147-150.

The following definition will let us deal smoothly with occurrences of ‘S’ in 
axioms of PA. The definition is justified because each natural number has one (Exer- 
cise 6.19) and only one (Theorem 6.1) successor. 


Definition 6.8 ∀x(0 ≤ x → ∀y(y = Sx ↔ σxy)).


We will take advantage of this definition in our translations of the PA axioms S1
and S2. Inside of PA, we pretend that everything in the universe is a natural number. In
FA, we allow for objects (such as the number of natural numbers) that are not natural
numbers. So, when PA says “all x” our FA translation says “all natural numbers
x” (that is, according to Definition 6.7, “all x such that 0 ≤ x”). We translate S1 and
S2 as follows.

∀x(0 ≤ x → 0 ≠ Sx)


∀x∀y((0 ≤ x ∧ 0 ≤ y) → (Sx = Sy → x = y))
Exercise 6.9 and Theorem 6.2 let us verify that these are theorems of FA.
What about the PA induction axiom given above? Is its translation an FA theorem? 
Well, Theorem 6.8 is the induction principle of FA. We can capture it in a single 
sentence of FA: 


∀F((F0 ∧ ∀x∀y((Fx ∧ σxy) → Fy)) → ∀x(0 ≤ x → Fx)).


This says that if 0 has property F and F is σ-closed, then every natural number has
property F. Here is our translation of the PA induction axiom:


∀F((F0 ∧ ∀x((0 ≤ x ∧ Fx) → FSx)) → ∀x(0 ≤ x → Fx)).
Is this a theorem of FA? To confirm that it is, suppose F0 and
∀x((0 ≤ x ∧ Fx) → FSx).
Then, by Exercise 6.11,
∀x((0 ≤ x ∧ Fx) → (0 ≤ Sx ∧ FSx)).


Use comprehension to pick a property G such that
∀x(Gx ↔ (0 ≤ x ∧ Fx)).
Then ∀x(Gx → GSx) and, hence,
∀x∀y((Gx ∧ σxy) → Gy).
That is, G is σ-closed. Furthermore, by Theorem 6.7, G0. So, by the FA induction
principle, ∀x(0 ≤ x → Gx) and, hence, ∀x(0 ≤ x → Fx). We conclude that our
translation of the PA induction axiom is a theorem of FA.
Our next job is to confirm that translations of A1, A2, M1, and M2 are theorems of
FA. This will take a bit of work. Our job would be easier if comprehension supplied
us immediately with a relation R that behaves as follows:


∀w∀y₁∀y₂(Ry₁y₂ ↔ ((y₁ = 0 ∧ y₂ = w) ∨ ∃x₁(σx₁y₁ ∧ ∃x₂(Rx₁x₂ ∧ σx₂y₂)))).


Once we pick a w, R is supposed to be the relation that holds between two natural
numbers when the second is exactly w more than the first. That is:


Ry₁y₂ ⟺ y₂ = y₁ + w.


Our formula considers two possibilities. First, y₁ = 0. Note that y₂ is exactly w more
than 0 if and only if y₂ = w:


y₂ = 0 + w ⟺ y₂ = w.


That explains the clause
(y₁ = 0 ∧ y₂ = w).


The second possibility is that y₁ succeeds x₁. That is, y₁ = x₁ + 1. Note that y₂ is
exactly w more than x₁ + 1 if and only if y₂ − 1 is exactly w more than x₁:


y₂ = (x₁ + 1) + w = (x₁ + w) + 1 ⟺ y₂ − 1 = x₁ + w.
Or, letting y₂ = x₂ + 1:
x₂ + 1 = (x₁ + 1) + w = (x₁ + w) + 1 ⟺ x₂ = x₁ + w.


That is, y₂ is exactly w more than y₁ if and only if x₂ is exactly w more than x₁.
More briefly, Ry₁y₂ if and only if Rx₁x₂. That explains the clause


∃x₁(σx₁y₁ ∧ ∃x₂(Rx₁x₂ ∧ σx₂y₂)).
This is all very elegant.


Unfortunately, we have a problem. The formula at the start of the last paragraph
is not a comprehension axiom: it violates the requirement that ‘R’ not appear free
on the right side of a comprehension biconditional when R is the relation on the left
side. Nonetheless, we can use our formula to define a useful notion. If, for a given
natural number w, a relation R satisfies the formula

Ry₁y₂ ↔ ((y₁ = 0 ∧ y₂ = w) ∨ ∃x₁(σx₁y₁ ∧ ∃x₂(Rx₁x₂ ∧ σx₂y₂)))
for all natural numbers y₁, y₂, then we say that R is a “plus-w” relation. We need to
show that there is a plus-w relation for each natural number w. We accomplish this
by induction in the following exercise and theorem.
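
The plus-w condition can be tested mechanically on an initial segment of the natural numbers. The following is only a sketch under stated assumptions: ordinary Python integers stand in for the natural numbers of FA, the quantified variables are restricted to 0..bound, and the names sigma, plus_w_clause, is_plus_w, and make_plus are hypothetical helpers rather than anything defined in the text. A bounded check of this kind illustrates the condition; it proves nothing about FA.

```python
def sigma(x, y):
    """Successor relation on natural numbers: y succeeds x."""
    return y == x + 1


def plus_w_clause(R, w, y1, y2, bound):
    """Right-hand side of the plus-w condition.

    Assumption of this sketch: the quantified variables x1 and x2
    range only over 0..bound.
    """
    base = y1 == 0 and y2 == w
    step = any(
        sigma(x1, y1) and any(R(x1, x2) and sigma(x2, y2) for x2 in range(bound + 1))
        for x1 in range(bound + 1)
    )
    return base or step


def is_plus_w(R, w, bound):
    """Check that R y1 y2 agrees with the clause for all y1, y2 in 0..bound."""
    return all(
        R(y1, y2) == plus_w_clause(R, w, y1, y2, bound)
        for y1 in range(bound + 1)
        for y2 in range(bound + 1)
    )


def make_plus(w):
    """The intended relation: y2 is exactly w more than y1."""
    return lambda y1, y2: y2 == y1 + w
```

For instance, is_plus_w(make_plus(3), 3, 12) holds, while make_plus(2) fails the plus-3 condition at the pair (0, 2), where the relation holds but neither clause does.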


Exercise 6.20 Show that there is a plus-0 relation. 


Theorem 6.13 If w is a natural number and Q is a plus-w relation, then there is a
plus-Sw relation. 


Proof Comprehension supplies a relation R that behaves as follows:
Ry₁y₃ ↔ ∃y₂(Qy₁y₂ ∧ σy₂y₃).
R is to be our plus-Sw or plus-(w + 1) relation. The idea is:
y₁ + w + 1 = y₃ ⟺ (y₁ + w) + 1 = y₃.
Since Q is a plus-w relation:
Qy₁y₂ ↔ ((y₁ = 0 ∧ y₂ = w) ∨ ∃x₁(σx₁y₁ ∧ ∃x₂(Qx₁x₂ ∧ σx₂y₂))).


That is:
Qy₁y₂ ↔ ((y₁ = 0 ∧ y₂ = w) ∨ ∃x₁(σx₁y₁ ∧ Rx₁y₂)).


We want to show that R is a plus-Sw relation. That is,

Ry₁y₃ ↔ ((y₁ = 0 ∧ y₃ = Sw) ∨ ∃x₁(σx₁y₁ ∧ ∃y₂(Rx₁y₂ ∧ σy₂y₃)))
whenever y₁ and y₃ are natural numbers. To verify the left-right half of the
biconditional, we assume that Ry₁y₃. This lets us pick a y₂ such that Qy₁y₂ and σy₂y₃.
Then:

(y₁ = 0 ∧ y₂ = w) ∨ ∃x₁(σx₁y₁ ∧ Rx₁y₂).
By Theorem 6.1,
(y₁ = 0 ∧ y₂ = w) → (y₁ = 0 ∧ y₃ = Sw)


since σwSw. Furthermore:


∃x₁(σx₁y₁ ∧ Rx₁y₂) → ∃x₁(σx₁y₁ ∧ (Rx₁y₂ ∧ σy₂y₃)).
But:


∃x₁(σx₁y₁ ∧ (Rx₁y₂ ∧ σy₂y₃)) → ∃x₁(σx₁y₁ ∧ ∃y₂(Rx₁y₂ ∧ σy₂y₃)).


So:
Ry₁y₃ → ((y₁ = 0 ∧ y₃ = Sw) ∨ ∃x₁(σx₁y₁ ∧ ∃y₂(Rx₁y₂ ∧ σy₂y₃))).
Now we need to show the converse. We consider two cases. First:
y₁ = 0 ∧ y₃ = Sw.


Note that Q0w and, hence, ∃y₂(Q0y₂ ∧ σy₂Sw). That is, R0Sw. But then Ry₁y₃.
Here is the second case:


∃x₁(σx₁y₁ ∧ ∃y₂(Rx₁y₂ ∧ σy₂y₃)).


Pick such a y₂. Then Qy₁y₂ since ∃x₁(σx₁y₁ ∧ Rx₁y₂). So ∃y₂(Qy₁y₂ ∧ σy₂y₃)
and, hence, Ry₁y₃. We conclude that R is a plus-Sw relation.
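
The construction in this proof, composing a plus-w relation with the successor relation, can be illustrated on finite sets of pairs. This is a sketch under assumptions: relations are modeled as Python sets of integer pairs over a bounded range, and compose_with_successor and intended_plus are hypothetical names introduced only for this illustration.

```python
def compose_with_successor(Q, universe):
    """R y1 y3 holds iff some y2 has Q y1 y2 and y3 succeeds y2."""
    return {
        (y1, y3)
        for (y1, y2) in Q
        for y3 in universe
        if y3 == y2 + 1
    }


def intended_plus(w, bound):
    """Pairs (y1, y1 + w) for y1 in 0..bound: the intended plus-w relation."""
    return {(y1, y1 + w) for y1 in range(bound + 1)}


# Starting from the intended plus-4 relation on 0..10, composing with
# successor yields the intended plus-5 relation on the same range.
bound, w = 10, 4
R = compose_with_successor(intended_plus(w, bound), range(bound + w + 2))
```

Under these assumptions R coincides with intended_plus(5, 10), which mirrors the step from plus-w to plus-Sw.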


Exercise 6.21 Suppose R is a plus-w relation. Use induction to show that, for every
natural number y₁, there is a natural number y₂ such that Ry₁y₂.


Exercise 6.22 Suppose R is a plus-w relation. Use induction to show that, for any
natural numbers y₁, y₂, y₃, if Ry₁y₂ and Ry₁y₃, then y₂ = y₃.


Exercise 6.23 Suppose Q and R are plus-w relations. Use induction to show that,
for any natural numbers y₁, y₂, if Qy₁y₂, then Ry₁y₂.


Theorem 6.14 If y₁ and y₂ are natural numbers and R is a plus-w relation, then
Ry₁y₂ if and only if RSy₁Sy₂.


Proof Note, first, that
RSy₁Sy₂ ↔ ((Sy₁ = 0 ∧ Sy₂ = w) ∨ ∃x₁(σx₁Sy₁ ∧ ∃x₂(Rx₁x₂ ∧ σx₂Sy₂))).
So, by Exercise 6.9,
RSy₁Sy₂ ↔ ∃x₁(σx₁Sy₁ ∧ ∃x₂(Rx₁x₂ ∧ σx₂Sy₂)).
Note that σy₁Sy₁ and σy₂Sy₂. So, by Theorem 6.2,
∃x₁(σx₁Sy₁ ∧ ∃x₂(Rx₁x₂ ∧ σx₂Sy₂)) ↔ Ry₁y₂.


Exercise 6.24 Suppose R is a plus-0 relation. Use induction to show that, for every
natural number x, Rxx.


Definition 6.9 If w and y are natural numbers, we let y + w be the unique natural
number that every plus-w relation associates with y.


Now that we have introduced ‘+’ into the language of FA, we can show that 
straightforward translations of the PA addition axioms are theorems of FA. Here is 
our translation of A1.


Theorem 6.15 ∀x(0 ≤ x → (x + 0) = x).


Proof x + 0 is the unique natural number that every plus-0 relation associates with
x. But, according to Exercise 6.24, that number is x.


Theorem 6.16 ∀x∀y((0 ≤ x ∧ 0 ≤ y) → (Sx + y) = S(x + y)).


Proof Let R be a plus-y relation. Then Rx(x + y) and, hence, by Theorem 6.14,
RSxS(x + y). But RSx(Sx + y) since Sx + y is the unique natural number that
every plus-y relation associates with Sx. So, by Exercise 6.22, (Sx + y) = S(x + y).


Here is our translation of A2.
Theorem 6.17 ∀x∀y((0 ≤ x ∧ 0 ≤ y) → (x + Sy) = S(x + y)).


Proof We use induction. y is the unique natural number that every plus-y relation
associates with 0. That is, (0 + y) = y. Sy is the unique natural number that every
plus-Sy relation associates with 0. That is, (0 + Sy) = Sy. So


(0 + Sy) = Sy = S(0 + y)


and, hence, the theorem holds when x = 0. As an inductive hypothesis, suppose 
(x + Sy) = S(x + y). Then, with some help from Theorem 6.16, we can reason as 
follows: 

(Sx + Sy) = S(x + Sy) = SS(x + y) = S(Sx + y). 
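
The three addition theorems can be spot-checked on a bounded range by reading sums off a family of plus-w relations. This is a sketch under assumptions, not a proof: plus-0 is taken to be identity on 0..bound (in line with Exercise 6.24), each later relation is obtained from the previous one by composing with successor, and plus_relations and add are hypothetical names.

```python
def plus_relations(max_w, bound):
    """Build plus-0, plus-1, ..., plus-max_w as finite sets of pairs.

    Assumption of this sketch: plus-0 is identity on 0..bound, and each
    later relation is the previous one composed with successor.
    """
    relations = [{(y, y) for y in range(bound + 1)}]
    for _ in range(max_w):
        previous = relations[-1]
        relations.append({(y1, y2 + 1) for (y1, y2) in previous})
    return relations


def add(relations, y, w):
    """Return the unique number that the plus-w relation associates with y."""
    values = [y2 for (y1, y2) in relations[w] if y1 == y]
    assert len(values) == 1, "a plus-w relation should be functional"
    return values[0]


relations = plus_relations(6, 8)
for x in range(8):
    for y in range(6):
        assert add(relations, x, 0) == x                          # Theorem 6.15
        assert add(relations, x + 1, y) == add(relations, x, y) + 1  # Theorem 6.16
        assert add(relations, x, y + 1) == add(relations, x, y) + 1  # Theorem 6.17
```

The helper add mirrors Definition 6.9: it looks up the one value that the plus-w relation pairs with y instead of calling Python's built-in addition.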


Exercise 6.25 Using M1 and M2 from Chap. 2 and our definition of plus-w relations 
as guides, introduce the notion of “times-w” relations. 


Exercise 6.26 Show that there is a times-0 relation.


Exercise 6.27 Show: if w is a natural number and Q is a times-w relation, then 
there is a times-Sw relation. 


Exercise 6.28 Suppose R is a times-w relation. Use induction to show that, for every
natural number y₁, there is a natural number y₂ such that Ry₁y₂.


Exercise 6.29 Suppose R is a times-w relation. Use induction to show that, for any
natural numbers y₁, y₂, y₃, if Ry₁y₂ and Ry₁y₃, then y₂ = y₃.


Exercise 6.30 Suppose Q and R are times-w relations. Use induction to show that,
for any natural numbers y₁, y₂, if Qy₁y₂, then Ry₁y₂.

If x is a natural number, we can let x · w be the unique natural number that every
times-w relation associates with x. Then we can prove the following in FA.


∀x(0 ≤ x → (x · 0) = 0)
∀x∀y((0 ≤ x ∧ 0 ≤ y) → (x · Sy) = ((x · y) + x)).


That is, we can show that translations of the PA axioms M1 and M2 are theorems of 
FA. You might want to fill in the details yourself. 

The only remaining axioms of second-order PA are the infinitely many compre- 
hension axioms. Consider one with the following form. 


∀w∃F∀x(Fx ↔ φ)
Let φ′ be the FA formula that translates the PA formula φ. Then
∀w∃F∀x(Fx ↔ φ′)
is a comprehension axiom of FA. This axiom implies
∀w(0 ≤ w → ∃F∀x(0 ≤ x → (Fx ↔ φ′)))


which is our translation of the PA comprehension axiom. We conclude that every PA 
comprehension axiom translates into a theorem of FA. And our grand conclusion is 
that second-order PA is interpretable in FA. 


6.5 Extensions: An Epic Failure 


We will now consider a modification of FA: FREGE ARITHMETIC WITH EXTENSIONS 
(FAX).⁴ The vocabulary of FAX is the result of dropping ‘#’ from the vocabulary
of FA and adding the symbols ‘{’, ‘}’, and ‘:’. The OBJECT TERMS of FAX are the
object variables and any expression ⌜{α : Γα}⌝ where α is an object variable and
Γ is a property variable. The idea is that {x : Fx} is the EXTENSION of the property
F: it is the set of F-objects. If F were the property of being a dog, then {x : Fx} 
would be the set of dogs. As the expression “object term” suggests, extensions are 
classified as objects. We define the FORMULAS of FAX just as we did the formulas of 
FA. We obtain the axioms of FAX by dropping Hume’s principle and adding BASIC 
Law V. 


[BLV] ∀F∀G({x : Fx} = {x : Gx} ↔ ∀x(Fx ↔ Gx)).


4 This is a version of Frege’s system in [4] and [5]. ‘FAX’ is just our name for this theory: do not 
expect anyone else to call it that. 


5 The ‘V’ is a Roman numeral: BLV is the fifth “basic law” from Frege’s Grundgesetze ([4], [5]).

The idea is: the set of F-objects is the same as the set of G-objects if and only if the
F-objects are the same as the G-objects.

As reasonable as Basic Law V might sound, we will now confirm that FAX is 
inconsistent. Comprehension supplies a property G that behaves as follows: 


∀y(Gy ↔ ∃F({x : Fx} = y ∧ ¬Fy)).


Each G-object is an extension that lacks the very property of which it is the 
extension. For example, the set of all non-self-identical objects (the set of all objects 
x such that x ≠ x) is a G-object because it is identical to itself and, hence, lacks the
property (non-self-identity) of which it is the extension. The set of all self-identical 
objects (the set of all objects x such that x = x) is not a G-object because it is identical 
to itself and, hence, has the property (self-identity) of which it is the extension. 

We are going to consider the set of all G-objects: {x : Gx}. The instance of 
comprehension we just cited makes a claim about “any object y”. Letting y be 
{x : Gx}, we obtain: 


G{x : Gx} ↔ ∃F({x : Fx} = {x : Gx} ∧ ¬F{x : Gx}).

Suppose {x : Gx} is not a G-object. That is, suppose ¬G{x : Gx}. Then:

¬∃F({x : Fx} = {x : Gx} ∧ ¬F{x : Gx}).
Logic students may recognize that this is equivalent to:

∀F({x : Fx} = {x : Gx} → F{x : Gx}).
This sentence makes a claim about “any property F”. Letting F be G, we obtain:

{x : Gx} = {x : Gx} → G{x : Gx}.
But it is certainly true that {x : Gx} = {x : Gx}. So G{x : Gx}. That is, {x : Gx}
is a G-object. We got to this point by assuming that {x : Gx} is not a G-object. So
we have shown:
¬G{x : Gx} → G{x : Gx}.

It follows that {x : Gx} really is a G-object (because if it were not, then it would be
and we would find ourselves in the absurd situation that it both is and is not). This
means:

∃F({x : Fx} = {x : Gx} ∧ ¬F{x : Gx}).


Pick such an F. Then {x : Gx} is not an F-object. Furthermore, by Basic Law V:


∀x(Fx ↔ Gx).
In particular:
F{x : Gx} ↔ G{x : Gx}.


So, since {x : Gx} is a G-object, it is an F-object. That is, {x : Gx} both is and is
not an F-object and, so, both is and is not a G-object. We conclude that Basic Law
V is an exceptionally plausible-looking absurdity. 
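
The argument can be replayed in miniature on a finite universe. The sketch below rests on assumptions that go beyond the text: properties are modeled as all subsets of a small set of objects, an "extension-of" operator is modeled as an arbitrary dictionary ext from subsets to objects, and the names powerset, russell_witness, and every_assignment_violates_blv are hypothetical. For every such ext, the set G built as in the proof leads to a property F with the same extension as G but different members, contrary to Basic Law V.

```python
from itertools import combinations, product


def powerset(objects):
    """All subsets of `objects`, as frozensets."""
    objs = list(objects)
    return [frozenset(c) for r in range(len(objs) + 1) for c in combinations(objs, r)]


def russell_witness(ext, subsets):
    """Return (F, G) with ext[F] == ext[G] although F and G differ, if any.

    G collects each object that is the extension of some property
    it does not fall under, as in the comprehension instance above.
    """
    G = frozenset(ext[F] for F in subsets if ext[F] not in F)
    r = ext[G]
    for F in subsets:
        if ext[F] == r and r not in F:
            return F, G
    return None


def every_assignment_violates_blv(n):
    """Try every map from subsets of an n-object universe to objects."""
    objects = range(n)
    subsets = powerset(objects)
    for values in product(objects, repeat=len(subsets)):
        ext = dict(zip(subsets, values))
        witness = russell_witness(ext, subsets)
        if witness is None:
            return False
        F, G = witness
        if F == G or ext[F] != ext[G]:
            return False
    return True
```

In this finite setting the failure also follows from counting, since there are more subsets than objects; the point of the sketch is only that the Russell-style set G locates a concrete violation for each candidate assignment.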


6.6 The Perils of Abstraction 


Frege’s plan was to use extensions to define the function # and to do so in a way that 
allowed him to derive Hume’s principle. He would then develop arithmetic much 
as we did above. The discovery that FAX is inconsistent left this project in ruins.⁶
Frege’s failure to notice the inconsistency of FAX may appear more forgivable if we 
consider the logical kinship of Hume’s principle and Basic Law V. We begin with a 
definition. 


Definition 6.10 ∀F∀G(F ≡ G ↔ ∀x(Fx ↔ Gx)).


Properties that apply to the same objects are said to be EXTENSIONALLY EQUIVALENT:
≡ is the relation of extensional equivalence. Properties that apply to the
same number of objects are said to be CARDINALLY EQUIVALENT: ≈ is the relation
of cardinal equivalence. Both extensional equivalence and cardinal equivalence are,
indeed, EQUIVALENCE RELATIONS: that is, transitive, symmetric, and reflexive.⁷

Let us now write ‘XF’ instead of ‘{x : Fx}’. This makes the similarity between
Hume’s principle and Basic Law V even more evident.


∀F∀G(#F = #G ↔ F ≈ G)
∀F∀G(XF = XG ↔ F ≡ G)


Propositions of this form are known as ABSTRACTION PRINCIPLES. Hume’s principle 
says that properties have the same cardinality if and only if they are cardinally 
equivalent. Basic Law V says that properties have the same extension if and only if 
they are extensionally equivalent. It may not sound too outlandish to say that HP is 
part of the very “logic” of the concept of cardinality, while BLV is part of the very 
“logic” of the concept of extension. The formal similarity of HP and BLV tempts us 
to accord them the same logical status. Of course, we know this is wrong-headed. 
When you add HP to second-order logic, you get arithmetic. When you add BLV 
to second-order logic, you get nonsense. It is hard to imagine a more significant 
difference from a logical perspective. 


6 For some of the history, see Quine [8] (http://www.jstor.org/stable/2251464), Rang & Thomas
[9], and van Heijenoort [11], pp. 124-128. 

7 TRANSITIVITY: if F = G and G = H, then F = H. SYMMETRY: if F = G, then G = F.
REFLEXIVITY: F = F.

At least one conclusion seems inescapable: if Hume’s principle is a logical
or conceptual truth, this cannot be simply because of its logical form. Some energetic
philosophers have labored hard to identify other distinctive features of Hume’s
principle that justify assigning it some sort of exalted status. Knowledgeable people
disagree about the extent to which these efforts have been crowned with success.⁸

Even if we deny that Hume’s principle is a logical truth, we can still be impressed 
by how much of the heavy lifting in FA is performed by the logical machinery. FA 
is a departure from the set theoretic mainstream that relies heavily on a powerful 
version of classical logic. In our next chapter, we will consider an alternative to 
classical logic itself. This is another way to break free from the prevailing current. 
First, though, we consider a Fregean theory of extensions that is not known to be 
absurd. 


6.7 Monadic Frege Arithmetic 


In FAX, extensions of properties are objects to which properties apply. So it makes 
sense to ask whether a property does or does not apply to its own extension. This 
leads to disaster: the property “being the extension of a property that does not apply 
to its own extension” both does and does not apply to its own extension. 

One popular fix is to keep the extensions but lose the properties: we let the variables 
‘F’, ‘G’, ‘H’, ... range over sets while the variables ‘w’, ‘x’, ‘y’, ... range over
members of those sets. ‘Fx’ will now mean that object x is a member of set F. It
will often be convenient to express this in the more familiar way: ‘x ∈ F’. We also
express non-membership in the familiar way: ‘x ∉ F’. It is characteristic of sets that
they are the same if they have the same members. We enshrine this principle in an
axiom (and save a little ink by writing ‘∀F, G’ for ‘∀F∀G’).


Axiom 6.1 ∀F, G(F ≡ G → F = G).


This is one half of BLV with both occurrences of ‘X’ deleted. Of course, ‘XF’ no 
longer makes much sense. Sets do not have extensions, they are extensions. The 
extension of set F would be, if anything, F itself. 

In the language of FAX we could ask whether FXF, that is, whether property
F applies to extension XF. We might now think that expressions such as ‘FF’ or
‘F ∈ F’ make claims that, true or false, at least make sense. We might want to use
such expressions to raise the question of whether set F is a member of itself. Our
syntax, however, does not care how we feel about this: ‘FF’ is not even a formula.
So we cannot even ask whether the set of all self-membered sets is self-membered (as
we asked, in Chap. 3, whether the list of all lists that do not list themselves lists itself).
‘FF’ was not a formula in FAX either, but FAX featured a device, the “extension-of”
operator, that blasted through the grammatical barrier by converting property terms
(such as ‘F’) into object terms (such as ‘XF’) in a particularly unfortunate way. We
are going to try hard to retain the grammatical prohibition on set self-membership.
We will still use the “number-of” operator to convert set terms (such as ‘F’) into
object terms (such as ‘#F’). Our syntax will not prevent the cardinality of a set (the
number of its members) from being a member of that set. As far as we know, this is
harmless.


8 For one noteworthy appraisal of this program of “neo-Fregeanism,” see Weir [13] (available via
www.projecteuclid.org). See also Shapiro [10].

We are going to deviate from FAX (and, indeed, FA) in one other respect. We 
are going to jettison the entire apparatus of relation variables and relational comprehension
axioms. Our logic is then said to be MONADIC SECOND-ORDER. So we
say our new theory is MONADIC FREGE ARITHMETIC (MFA).⁹ Object variables are
first-order. Set variables are second-order. Set variables are, furthermore, monadic
because each applies to only one object variable at a time: ‘Fx’ is grammatical, but
‘Fxy’ is not. One reason for going monadic is that it lets us investigate in detail
the contribution that the relational machinery made to the construction of arithmetic 
in FA. We will keep careful track of the new axioms and vocabulary we adopt to 
compensate for the absence of relation variables. 

We do not have to wait long to feel our loss. Look back at Definition 6.2. We used 
relation variables to define cardinal equivalence. Although that route is no longer 
open to us, we can still treat ‘≈’ as a defined term. We just use Hume’s principle as
our definition.


Definition 6.11 ∀F, G(F ≈ G ↔ #F = #G).


Since we have not yet said how the function # behaves, this might not seem very informative.
Nonetheless, we can make a little progress with just this much information.
First of all, we can show that ≈ is an equivalence relation.


Exercise 6.31 ≈ is reflexive, symmetric, and transitive.
≈ cannot be just any equivalence relation, however.


Exercise 6.32 If all our comprehension axioms are true, then ≈ cannot be identity:
≈ cannot be =.¹⁰


It is an agreeable pastime to construct finite models of comprehension and Axiom
6.1. Here is a recipe for doing so. Start with n objects. This yields 2ⁿ sets of objects.
To interpret ‘≈’, partition those sets into EQUIVALENCE CLASSES: classes of sets
equivalent to one another and to nothing else. When you perform this step, just
make sure you put each of your 2ⁿ sets into exactly one equivalence class. (You can
put different sets in the same class, but you must not put the same set in different
classes.) When you say what the equivalence classes are, you are also saying how
you interpret ‘≈’: F ≈ G if and only if F and G belong to the same equivalence
class. If {F₁, F₂, F₃} is one of your equivalence classes, then you are saying that
the three sets F₁, F₂, F₃ are equivalent only to one another and each one to itself.
To interpret ‘#’, pair off equivalence classes with objects, pairing each equivalence
class with exactly one object and each object with no more than one equivalence
class. If we pair off equivalence class {F₁, F₂, F₃} with object x, we are telling the
function # to assign x to each member of {F₁, F₂, F₃} and to nothing else: we are
saying, first, that #F₁ = #F₂ = #F₃ = x and, second, that # does not assign x to
any other sets.


9 Yet again, this is just our name for the theory.
10 That is, the function # cannot be one-to-one:


∀F, G(#F = #G → F = G).


This is a version of a result known as “Cantor’s theorem”. GEORG CANTOR (1845-1918) was a
pioneer set theorist. The basic problem here is that you cannot use objects to supply each set of
objects with an avatar exclusive to that set; or, rather, you cannot do so while also making all our
comprehension axioms true.

Here is an example. Suppose x is our only object. Then we have two sets of objects:
the set of no objects ∅ and the set of all objects {x}. This leaves us two ways to form
equivalence classes. First, we could put our two sets in the same equivalence class:
{∅, {x}}. This makes every set (all two of them) equivalent to every set. Second, we
could put our two sets in separate equivalence classes: {∅} and {{x}}. This makes each
set equivalent only to itself. This latter approach treats ≈ as identity and, as you just
showed, is not compatible with an interpretation that makes all our comprehension
axioms true.¹¹ The former approach makes ∅ equivalent to {x}: ∅ ≈ {x}. Given the
one equivalence class {∅, {x}} and the one object x, we must pair the former with
the latter. That is, we must let # assign x to each member of our one equivalence
class: #∅ = #{x} = x. This interpretation makes Axiom 6.1 and all instances of
comprehension true. Of course, it is hardly the intended interpretation since it says
that ∅, a set with no members, is equivalent to {x}, a set with one member. Our
intention is for ≈ to behave like cardinal equivalence. (Cardinally equivalent sets are
sets with the same number of members.)
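
The recipe for finite models can be carried out by brute force for very small universes. What follows is a sketch under assumptions: sets of objects are Python frozensets, second-order variables are taken to range over all subsets, and powerset, partitions, and interpretations are hypothetical helper names. Each result pairs a choice of equivalence classes with a number-of map that sends every set in a class to that class's object.

```python
from itertools import combinations, permutations


def powerset(objects):
    """All subsets of `objects`, as frozensets."""
    objs = list(objects)
    return [frozenset(c) for r in range(len(objs) + 1) for c in combinations(objs, r)]


def partitions(items):
    """Yield every way of splitting the list `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in partitions(rest):
        for i, block in enumerate(smaller):
            # put `first` into an existing block
            yield smaller[:i] + [block + [first]] + smaller[i + 1:]
        # or give `first` a block of its own
        yield [[first]] + smaller


def interpretations(objects):
    """Yield (equivalence classes, number-of map) pairs built by the recipe."""
    objs = list(objects)
    for classes in partitions(powerset(objs)):
        # permutations(objs, k) is empty when there are more classes than objects
        for chosen in permutations(objs, len(classes)):
            number_of = {F: obj for block, obj in zip(classes, chosen) for F in block}
            yield classes, number_of
```

With the single object "x", the generator yields one interpretation only: the one that puts ∅ and {x} in the same class and assigns x to both, as in the example above. With two objects it yields sixteen.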

In our universe of one object and two sets, the only way to get ≈ to behave
like cardinal equivalence would be to let ≈ be the relation of identity: the one set
with zero members would then be cardinally equivalent only to itself; the one set
with one member would be cardinally equivalent only to itself. But, as just noted, this
interpretation does not make all our axioms true. So, in the tiniest universe, we cannot
satisfy our axioms by treating ≈ as cardinal equivalence. This is just one instance of a
general phenomenon: there are no finite models that treat ≈ as cardinal equivalence.
The reason is simple. In a universe of n objects, sets of objects come in n + 1 sizes:
0 through n. So there will not be enough objects to serve as sizes of sets. Axioms
that make ≈ behave like cardinal equivalence will imply that the universe is infinite.
Adding the right sort of information about ≈ will let us prove there are infinitely
many objects and sets of objects. In FA, we used relation variables to convey this
information. We are now trying to do this job without relation variables.
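
The counting argument can be checked directly for small n. This is a minimal sketch, assuming sets are modeled as Python frozensets and cardinal equivalence as "same number of members"; size_classes and enough_objects_for_sizes are hypothetical names.

```python
from itertools import combinations


def size_classes(n):
    """Group the subsets of an n-object universe by how many members they have."""
    objects = range(n)
    return {
        r: [frozenset(c) for c in combinations(objects, r)]
        for r in range(n + 1)
    }


def enough_objects_for_sizes(n):
    """Could each size class be paired with its own object?"""
    return len(size_classes(n)) <= n
```

For every n the dictionary has n + 1 keys (the sizes 0 through n), so enough_objects_for_sizes(n) is false: there is always one more size class than there are objects to pair it with.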


11 Note that you will not be able to complete the remaining step in the recipe I just gave. Why?
Given one object and two equivalence classes, you will not be able to pair each equivalence class
with exactly one object and each object with no more than one equivalence class. You are facing a
crippling shortage of objects.

Given any object z and sets F and G, comprehension and Axiom 6.1 guarantee
that there are unique sets ∅, {z}, (F ∪ G), (F \ G), and (F ∩ G) satisfying the
following conditions.


∀x(x ∈ ∅ ↔ x ≠ x)

∀x(x ∈ {z} ↔ x = z)
∀x(x ∈ (F ∪ G) ↔ (x ∈ F ∨ x ∈ G))
∀x(x ∈ (F \ G) ↔ (x ∈ F ∧ x ∉ G))
∀x(x ∈ (F ∩ G) ↔ (x ∈ F ∧ x ∈ G))


We stipulate that 0 = #∅. We retain Definition 6.3 as our account of the successor
relation σ. Given Axiom 6.1, Definition 6.3 is equivalent to the following.¹²


∀w, z(σwz ↔ ∃G∃x ∈ G(w = #(G \ {x}) ∧ z = #G)).
Note that
∃G∃x ∈ G(w = #(G \ {x}) ∧ z = #G) ↔ ∃F∃x ∉ F(w = #F ∧ z = #(F ∪ {x}))


since, given G and x, we can let F be G \ {x} and, given F and x, we can let G be
F ∪ {x}. We now introduce a successor relation that holds directly between sets.


Definition 6.12 ∀F, G(S(F, G) ↔ ∃x ∉ F (F ∪ {x}) ≈ G).
Exercise 6.33 ∀w, z(σwz ↔ ∃F, G(w = #F ∧ z = #G ∧ S(F, G))).
Exercise 6.34 ∀F, G, H((S(F, G) ∧ G ≈ H) → S(F, H)).


We define the natural numbers as we did above (though we now refer to sets rather 
than properties). Comprehension and Axiom 6.1 guarantee that there is exactly one 


12 As another ink-saving measure, we write
⌜∃u ∈ Γ φ⌝
instead of
⌜∃u(u ∈ Γ ∧ φ)⌝.
We will also write
⌜∀u ∈ Γ φ⌝
instead of
⌜∀u(u ∈ Γ → φ)⌝.
We treat ‘∉’ similarly. For example, we write
⌜∃u ∉ Γ φ⌝
instead of
⌜∃u(u ∉ Γ ∧ φ)⌝.
set ω whose members are the natural numbers. It quickly follows that 0 ∈ ω and that
ω ⊆ F whenever F is σ-closed and 0 ∈ F. (As in Definition 5.7, x ⊆ y, that is, x is
a SUBSET of y, if and only if every member of x is a member of y.)

Exercise 6.35 ω is σ-closed.

We say that a set F is FINITE when #F is a natural number.

Definition 6.13 ∀F(𝔉(F) ↔ #F ∈ ω).

The empty set ∅ is finite. Adding one member to a finite set yields a finite set.
Sets equivalent to finite sets are finite.

Exercise 6.36 𝔉(∅).
Exercise 6.37 ∀F(𝔉(F) → ∀x 𝔉(F ∪ {x})).
Exercise 6.38 ∀F, G((F ≈ G ∧ 𝔉(F)) → 𝔉(G)).
The following result will be a powerful tool for proving facts about finite sets. 


Theorem 6.18 If φ(∅) and

∀F(φ(F) → ∀x φ(F ∪ {x}))
∀F, G((φ(F) ∧ F ≈ G) → φ(G))

then ∀G(𝔉(G) → φ(G)).

Proof This is actually a meta-theorem (a theorem about the existence of theorems)
governing arbitrary formulas φ. Comprehension lets us pick a set H such that

∀x(x ∈ H ↔ ∃F(φ(F) ∧ #F = x)).

We are going to use induction to show that every natural number is a member of H.
0 ∈ H, since φ(∅) and #∅ = 0. Suppose w ∈ H and σwz. Exercise 6.33 lets us pick
F₁, G, and x such that w = #F₁, z = #G, x ∉ F₁, and (F₁ ∪ {x}) ≈ G. Since w ∈ H,
we may pick an F₂ such that φ(F₂) and #F₂ = w. By Definition 6.11, F₂ ≈ F₁ and,
hence, φ(F₁). So φ(F₁ ∪ {x}) and, hence, φ(G). So z ∈ H. That is, H is σ-closed.
So ω ⊆ H. Suppose 𝔉(G₁). By Definition 6.13, #G₁ ∈ ω and, hence, #G₁ ∈ H. So
we can pick an F₃ such that φ(F₃) and #F₃ = #G₁. By Definition 6.11, F₃ ≈ G₁
and, hence, φ(G₁).


H is a PROPER subset of G (H ⊂ G) if and only if H ⊆ G but H ≠ G. We say
that F ≺ G when F is equivalent to a proper subset of G.

Definition 6.14 ∀F, G(F ≺ G ↔ ∃H(F ≈ H ∧ H ⊂ G)).

Exercise 6.39 ∀F, G(F ⊂ G → F ≺ G).
Exercise 6.40 ∀F F ⊀ ∅.

Exercise 6.41 ∀F(F ≠ ∅ → ∅ ≺ F).

Definition 6.15 ∀G(𝔉_≺(G) ↔ (𝔉(G) ∧ ∀F(F ≺ G → 𝔉(F)))).
Exercise 6.42 𝔉_≺(∅).

Exercise 6.43 ∀F, G, H((F ≈ G ∧ G ≺ H) → F ≺ H).


You just showed: if F is equivalent to G and G is equivalent to a proper subset
of H, then F is equivalent to a proper subset of H. What if we switch this around a
bit? What if F is equivalent to a proper subset of G and G is equivalent to H? That
is,

F ≺ G ≈ H

instead of

F ≈ G ≺ H.

Does it then follow that F is equivalent to a proper subset of H (F ≺ H)? This
does not follow from our current axioms. In the tiny model we discussed above,
{0} ≈ ∅ ⊂ {0} and, hence, {0} ≺ {0} ≈ ∅. But {0} ⊀ ∅. ({0} is not equivalent to a
proper subset of ∅ because ∅ has no proper subsets.) Since our variation on Exercise
6.43 is a reasonable, powerful, but underivable claim about cardinal equivalence (our
intended interpretation of ‘≈’), we adopt it as an axiom. (You might take a moment
to note, though, how far we have gotten with just comprehension and Axiom 6.1.)

Axiom 6.2 ∀F, G, H((F ≺ G ∧ G ≈ H) → F ≺ H).
Exercise 6.44 ∀F(F ≈ ∅ → F = ∅).

Exercise 6.45 ≺ is transitive.

Exercise 6.46 ∀F, G(S(F, G) → F ≺ G).

Exercise 6.47 ∀F, G((F ⊀ F ∧ F ≈ G) → G ⊀ G).
Exercise 6.48 ∀F, G((𝔉_≺(F) ∧ F ≈ G) → 𝔉_≺(G)).


We will need to adopt another axiom. First, though, we consider another
interpretation: the INTENDED one. Our objects are the natural numbers and ∞. (You can let
∞ be anything that is not a natural number.) Our sets are all the sets of these objects.
∈ is membership. # assigns objects to sets as follows.

#F = n if F has n members
#F = ∞ if F is infinite

This means that ≈ is cardinal equivalence. So F ≺ G if and only if F is the same
size as a proper subset of G. There are two ways for this to happen: either F is finite
and smaller than G or F and G are both infinite. Now consider a comprehension
axiom of the form

∀G₁, ..., Gₘ∀y₁, ..., yₙ∃F∀x(x ∈ F ↔ φ).

For each choice of sets G₁, ..., Gₘ and objects y₁, ..., yₙ, our interpretation provides
a set whose members are exactly the objects satisfying the predicate φ. Our
interpretation does so because it includes every set of our objects. So every instance of
comprehension is true in our interpretation. Axiom 6.1 is true because our object
variables range over all the members of our sets and, furthermore, our interpretation
treats ∈ as membership. So, if our interpretation makes

∀x(x ∈ F ↔ x ∈ G)

true, then F and G really do have the same members and, so, are the same.

Exercise 6.49 Confirm that Axiom 6.2 is true in the intended interpretation. 
After a useful definition, we will introduce a new axiom. 

Definition 6.16 ∀F, G(F ≼ G ↔ (F ≺ G ∨ F ≈ G)).

Axiom 6.3 ∀F, G∀x(F ≺ (G ∪ {x}) → F ≼ G).

Exercise 6.50 Confirm that Axiom 6.3 is true in our first (tiny) interpretation.

Exercise 6.51 Confirm that Axiom 6.3 is true in the intended interpretation.

Exercise 6.52 ∀F((F ∪ {x}) ≺ (F ∪ {x}) → F ≺ F).

Theorem 6.19 ∀F(𝔉(F) → F ⊀ F).

Proof Apply Exercises 6.40, 6.47, 6.52, and Theorem 6.18.

Exercise 6.53 ∀F, G((𝔉(F) ∧ F ≼ G ∧ G ≼ F) → F ≈ G).

Exercise 6.54 ∀F, G(S(F, G) → ∀H(H ≼ F ↔ H ≺ G)).

Exercise 6.55 ∀F, G, H((𝔉(F) ∧ S(F, G) ∧ S(H, G)) → F ≈ H).

Exercise 6.56 ∀x ∈ ω∀y, z((σxy ∧ σzy) → x = z).

Exercise 6.57 If 𝔉(F), x ∉ F, and y ∉ G, then
(F ∪ {x}) ≈ (G ∪ {y}) → F ≈ G.

Exercise 6.58 ∀G(𝔉_≺(G) → ∀x 𝔉_≺(G ∪ {x})).
Theorem 6.20 ∀F, G((𝔉(G) ∧ F ≺ G) → 𝔉(F)).
Proof By Exercises 6.42, 6.48, 6.58, and Theorem 6.18, ∀G(𝔉(G) → 𝔉_≺(G)).
Exercise 6.59 If 𝔉(F), x ∈ F, and y ∈ G, then

F ≈ G → (F \ {x}) ≈ (G \ {y}).

Exercise 6.60 ∀F(𝔉(F) → ∀x ∉ F∀y ∈ F F ≈ ((F ∪ {x}) \ {y})).
Exercise 6.61 ∀F(𝔉(F) → ∀x, y ∉ F (F ∪ {x}) ≈ (F ∪ {y})).
Exercise 6.62 ∀F, G, H((𝔉(F) ∧ S(F, G) ∧ S(F, H)) → G ≈ H).
Exercise 6.63 Show that 0 has a unique successor.
Exercise 6.64 Show that 0’s successor has a unique successor.
Here is our final axiom.
Axiom 6.4 ∀F, G∀x(F ≺ G → (F ∪ {x}) ≼ G).
Exercise 6.65 Confirm that Axiom 6.4 is true in our first (tiny) and second (intended) 
interpretations. 


Here is a table showing how our axioms behave in our two interpretations (writing 
‘C’ for the infinitely many comprehension axioms). 


        Tiny    Intended

C       True    True
6.1     True    True
6.2     False   True
6.3     True    True
6.4     True    True


Since Axiom 6.2 can be false while the other axioms are true, 6.2 does not follow 
from those axioms. Furthermore, since those other axioms are all true in a universe 
with just one object and two sets (a universe too small and too simple to harbor 
unseen absurdities), we can have no serious doubts about their consistency. Note 
too: if we are confident that the story of the intended interpretation is coherent, we 
can be confident that all our axioms together are consistent. 
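Because the tiny interpretation has only one object and two sets, the entries in the "Tiny" column can be confirmed by brute force. The Python sketch below is an illustration under stated assumptions: the names `SETS`, `num`, `equiv`, `prec`, and `preceq` are hypothetical, the relation of equivalence is read as sameness of the assigned object (as in Definition 6.11), and # is assumed to send both sets to the single object 0.

```python
from itertools import product

OBJECTS = [0]
SETS = [frozenset(), frozenset({0})]

def num(F):
    # Assumption: in the tiny interpretation, # sends both sets to the one object, 0.
    return 0

def equiv(F, G):
    # F is equivalent to G when #F = #G
    return num(F) == num(G)

def prec(F, G):
    # F is equivalent to some proper subset H of G
    return any(equiv(F, H) and H < G for H in SETS)

def preceq(F, G):
    return prec(F, G) or equiv(F, G)

def axiom_6_2():
    return all(not (prec(F, G) and equiv(G, H)) or prec(F, H)
               for F, G, H in product(SETS, repeat=3))

def axiom_6_3():
    return all(not prec(F, G | {x}) or preceq(F, G)
               for F, G in product(SETS, repeat=2) for x in OBJECTS)

def axiom_6_4():
    return all(not prec(F, G) or preceq(F | {x}, G)
               for F, G in product(SETS, repeat=2) for x in OBJECTS)

print(axiom_6_2(), axiom_6_3(), axiom_6_4())
```

Under these assumptions the check agrees with the table: Axiom 6.2 fails (take F = G = {0} and H = ∅) while Axioms 6.3 and 6.4 hold.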


Exercise 6.66 ∀F(𝔉(F) → ∀G(F ≺ G ∨ G ≺ F ∨ F ≈ G)).
Exercise 6.67 ∀F, G, H((𝔉(F) ∧ S(F, G) ∧ F ≈ H) → S(H, G)).
Exercise 6.68 ∀x ∈ ω∀y, z((σxy ∧ σxz) → y = z).


In your solution of Exercise 6.19, you supplied an FA proof that every natural 
number has a successor. If you go back and look, you can confirm that all the FA 
results you used in your proof have already been or can easily be reproduced here in 
MFA.¹³ So we state the following theorem without proof.

13 The following results are sufficient: Theorems 6.1–6.5, 6.8, 6.9, 6.11 and Exercises 6.7, 6.9, 6.12,
6.13, 6.15.

Theorem 6.21 ∀x ∈ ω∃y σxy.


Just as in FA, we can now obtain MFA translations of the PA axioms S1 and S2.
Axioms Al, A2, M1, and M2 (governing addition and multiplication) present new 
challenges. In FA, we used relation variables to define addition and multiplication. 
Having jettisoned the relational machinery, we need a new approach. Here is one way 
of defining the relation “z is the sum of x and y” without using relation variables. 


Definition 6.17 ∀x, y, z(α(x, y, z) ↔ ∃F, G(x = #F ∧ y = #G ∧ z = #(F ∪ G) ∧
(F ∩ G) = ∅)).

The sum of x and y is the number of objects in the union of two disjoint sets, one
with x members and one with y members. The relation α lets us define the relation
“x is a multiple of y”. Actually, we define the set of multiples of y.

Definition 6.18 ∀x, y(x ∈ ω_{×y} ↔ ∀F((0 ∈ F ∧ ∀w ∈ F∀z(α(w, y, z) → z ∈ F))
→ x ∈ F)).


The idea is that multiples of y are what tireless immortal beings encounter when 
they start with 0 and add y to everything they encounter. The multiples of y are 
the things that appear in every set that has 0 as a member and is closed under the 
operation “plus-y”. 

The following definition will be useful. Here the “less than” relation < is defined 
just as in FA. 


Definition 6.19 ∀x, y(x ∈ ω_{<y} ↔ (x ∈ ω ∧ x < y)).

We now notice that if z ≠ 0, then z is the product of x and y if and only if z is
a multiple of y and there are x multiples of y less than z. The product of 3 and 1
is 3 because 3 is a multiple of 1 and there are 3 multiples of 1 less than 3 (namely,
0, 1, and 2). The product of 3 and 2 is 6 because 6 is a multiple of 2 and there are
3 multiples of 2 less than 6 (namely, 0, 2, and 4). On the other hand, this approach
does not work so well when z = 0. We do not want to say: 0 is the product of x and
y if and only if 0 is a multiple of y and there are x multiples of y less than 0. This
would imply that 0 is the product of x and y only when x = 0 (since there cannot
be multiples of y less than 0). Note that the following definition of the relation “z is
the product of x and y” features a clause that allows 0 to be the product of x and y
even when x ≠ 0.¹⁴

Definition 6.20 ∀x, y, z(π(x, y, z) ↔ (z ∈ ω_{×y} ∧ (y = 0 ∨ x = #(ω_{×y} ∩ ω_{<z})))).
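The counting idea behind Definition 6.20 can be tried out on ordinary integers. The Python sketch below is only an illustration with hypothetical names (`multiples`, `is_product`). It assumes that the multiples of y up to a bound can be generated by starting with 0 and repeatedly adding y, which stands in for the quantification over all sets closed under "plus-y" in Definition 6.18.

```python
def multiples(y, bound):
    """Multiples of y no greater than bound: start with 0 and keep adding y."""
    found = {0}
    if y == 0:
        return found  # adding 0 never produces anything new
    current = 0
    while current + y <= bound:
        current += y
        found.add(current)
    return found

def is_product(x, y, z):
    """Check the condition of Definition 6.20 for natural numbers x, y, z."""
    mult = multiples(y, z)
    if z not in mult:
        return False  # z must be a multiple of y
    below_z = {m for m in mult if m < z}
    # Either y is 0, or x counts the multiples of y that are less than z.
    return y == 0 or x == len(below_z)

print(is_product(3, 1, 3), is_product(3, 2, 6), is_product(5, 0, 0))
```

The three calls correspond to the cases discussed in the text: 3 times 1, 3 times 2, and a product that is 0 even though x is not 0.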


If you find this topic interesting and want a big project, you might try to prove that 
suitable translations of Al, A2, M1, and M2 are theorems of MFA. You will want to 
show that pairs of natural numbers always have a unique natural number sum and a 
unique natural number product. Here are some exercises to get you started. 


14 Cf. the definition of multiplication in §5 of Visser [12]. 
Exercise 6.69 ∀F, G((𝔉(F) ∧ 𝔉(G)) → 𝔉(F ∪ G)).
Exercise 6.70 If 𝔉(F), x ∉ F, and y ∉ G, then

F ≈ G → (F ∪ {x}) ≈ (G ∪ {y}).

Exercise 6.71 If 𝔉(G₁), (F₁ ∩ G₁) = (F₂ ∩ G₂) = ∅, and G₁ ≈ G₂, then

F₁ ≈ F₂ ↔ (F₁ ∪ G₁) ≈ (F₂ ∪ G₂).

Exercise 6.72 ∀y ∈ ω∀x, z₁, z₂((α(x, y, z₁) ∧ α(x, y, z₂)) → z₁ = z₂).

Exercise 6.73

∀x, y₁ ∈ ω∀z₁, z₂((α(x, y₁, z₁) ∧ σz₁z₂) → ∃y₂(α(x, y₂, z₂) ∧ σy₁y₂)).

Exercise 6.74

∀x, y₁ ∈ ω∀y₂, z₁, z₂((α(x, y₁, z₁) ∧ σy₁y₂ ∧ σz₁z₂) → α(x, y₂, z₂)).

Exercise 6.75 ∀x ∈ ω∃F x = #F.
Exercise 6.76 ∀x ∈ ω α(x, 0, x).

Exercise 6.77 ∀x, y ∈ ω∃z ∈ ω α(x, y, z).


6.8 Solutions of Odd-Numbered Exercises 


6.1 Properties that apply to the same objects apply to the same number of objects. 
Properties that apply to the same number of objects apply to the same objects. Given 
any property F, there is a property that does not apply to the same objects as F, but
does apply to the same number of objects. There is a property F that interacts with 
every property G as follows: if F applies to the same number of objects as G, then G 
applies to no objects at all. Given any relation R, if something does R to everything, 
then everything has R done to it by something. 


6.3 You might define identity as follows: ∀x∀y(x = y ↔ ∀F(Fx ↔ Fy)). Note that
∀F(Fx ↔ Fx). Furthermore, our definition says that identical things have the same
properties. If you want to confirm that identical things satisfy the same formulas,
suppose φ(x) and x = y. Comprehension gives us an F such that ∀z(Fz ↔ φ(z)). By
our definition, Fx if and only if Fy. So φ(x) if and only if φ(y). Just for fun, you might
confirm that the following definition would work: ∀x∀y(x = y ↔ ∀F(Fx → Fy)).


6.5 Comprehension lets us suppose that ∀x(𝔇x ↔ (x = #𝔑 ∨ x = #𝔘)). As
in the previous exercise, #𝔑 ≠ #𝔘. Suppose 𝔘 ≈ 𝔇. Then, for some relation
R, ∀y(𝔇y → ∃₁x(Rxy ∧ 𝔘x)). Note that 𝔇#𝔑 and 𝔇#𝔘. So ∃₁x(Rx#𝔑 ∧ 𝔘x)
and ∃₁x(Rx#𝔘 ∧ 𝔘x). Since #𝔑 is the only object that has property 𝔘, we
infer that R#𝔑#𝔑 and R#𝔑#𝔘. But ∀x(𝔘x → ∃₁y(Rxy ∧ 𝔇y)). In particular,
∃₁y(R#𝔑y ∧ 𝔇y). Since such a y is unique, we infer that #𝔑 = #𝔘, contrary to
the previous exercise. We conclude that 𝔘 ≉ 𝔇. So, by Hume’s principle, #𝔘 ≠ #𝔇.
We need to do some more work before we can prove that there are infinitely many
numbers. You may already be convinced, however, that something like the above
reasoning can be repeated indefinitely.


6.7 Suppose #F = #𝔑. By Hume’s principle, F ≈ 𝔑. For some relation R,
∀x(Fx → ∃₁y(Rxy ∧ 𝔑y)). Since nothing has property 𝔑, this implies that nothing
has property F. On the other hand, suppose ∃₀x Fx. Let R be any relation at all. It
is vacuously true that ∀x(Fx → ∃₁y(Rxy ∧ 𝔑y)) just as it is vacuously true that
∀y(𝔑y → ∃₁x(Rxy ∧ Fx)). So F ≈ 𝔑 and, hence, by Hume’s principle, #F = #𝔑.


6.9 Suppose #𝔑 = #G. Then, by Exercise 6.7, nothing has property G. So there is
no way that ∃x(Gx ∧ ∀y(Fy ↔ (Gy ∧ y ≠ x))).


6.11 Suppose x is a natural number and σxy. Suppose F is a σ-closed property that
applies to 0. Then Fx since x has every σ-closed property that 0 has. So Fy since F
is σ-closed. We conclude: y has every σ-closed property that 0 has. That is, 0 < y.


6.13 It is vacuously true that (0 ≠ 0 → ∃wσw0). Suppose σxy. Then ∃wσwy and,
hence, (y ≠ 0 → ∃wσwy).


6.15 Theorem 6.5 guarantees that 0 has the desired property. As an inductive hypoth- 
esis, suppose Vz((x < z < x) ~ x = z). Suppose oxy and y < z < y. We need 
to show that y = z. By Theorems 6.3 and 6.4, x < z. By Theorem 6.10, —ozy and, 
hence, x 4 z. So, by our inductive hypothesis, z ¢ x. By Theorem 6.6, z € y and, 
hence, y = z. 


6.17 If y < 0, then y ¥ O and, by Theorem 6.5, y = 0. So doy Fy. Now apply 
Exercise 6.7. 


6.19 Suppose ox,x2. By Exercise 6.16, Vy(y < x1 < y < x2). Suppose 
Vy(Fiy = y < x1). Then Vy(Fiy <= y < x2) and, hence, by Theorem 6.12, 
#F, = x2. Suppose Vy(Fxy = y < x2). Then, by Theorem 6.11, ox2#F. 


6.21 ROw because (0 = OA w = w). Suppose Ry y2 (as an inductive hypothesis) and 
oy ,Z1. Exercise 6.19 lets us suppose that oy2z2. Then (ay ,z1 A (Ry, y2 A oy2Z2)) 
and, hence, 4x1 (0x1z) A dx2(Rx1x2 A 0x222)). So Rz1Z2. 


6.23 If QOyo, then y2 = w and, hence, ROy2. As an inductive hypothesis, suppose 
Vx2(Qy1x2 > Ry x2). Suppose oy, z1 and Qz12z2. Then 3x1 (0x1zZ1 Adx2(Qx1x2 A 
oxX2Z2)). By Theorem 6.2, 4x2(Qy,x2 A ox2Z2) and, hence, 4x2(Ryjx2 A ox222). 
So Rz₁z₂.


6.25 We could say: R is a times-w relation if and only if, for all natural numbers y; 
and yo, 


Ryiy2 > ((y1 = OA y2 = 0) V Axi (oxi y1 A Ax2(Rx1x2 A y2 = (42 + ))))- 
6.27 Comprehension lets us suppose that 
Wx Wy2(Rx1y2 <> Ax2(Qx1x2 A y2 = (x2 + x1))). 


R is to be our times-Sw or times-(w + 1) relation. The idea is that yz = x; - (w+ 1) 
if and only if y2 = (x1 - w) + x1. Since Q is a times-w relation: 


Oyiy2 > (1 = OA y2 = 0) V Axi (oxy A dx2(Qx1x2 A yo = (x2 + W)))). 
We want to show that R is a times-Sw relation. That is, 

Rx y2 <> (1 = 0A y2 = 0) V Az (ozx1 A Az2(R2122 A y2 = (22 + Sw)))) 
whenever x; and y2 are natural numbers. To verify the left-right half of the 
biconditional, we assume that Rx y2. This lets us pick a z2 such that Qx;z2 and 
y2 = (z2 + x1). Note that 

(x1 =O A z2 = 0) V Azi(ozimi A Ay (Qz1y1 A Z2 = (1 + w))). 
Suppose x; = 0. If z2 4 0, then oz ,0 for some z 1, contrary to Exercise 6.9. So 
Z2 = 0 and, hence, by Theorem 6.15, y2 = 0. Suppose x; # 0. Then we can pick 
Z1, y1 such that 0z}x1, Qz1y1, and z2 = (y; + w). We can show: 
yo = 22 + Szy = SQe2 + 21) = SCC] + w) + 21) = SC + 21) + w) = (9) +21) + Sw. 
Note that Rz1(y1 + z1) since (Qz1y1 A (yi + 21) = (1 + 21). So 


(oz1x1 A (R21 + 21) A yo = (1 + 21) + Sw))) 


and, hence, 
Az\(oz1x1 A dz2(Rz122 A y2 = (z2 + Sw))). 


Now we need to prove the right—left half of the biconditional. We consider two cases. 
First: x; = 0 = yo. Note that QO00 and, hence, 4x2(QO0x2 A 0 = (x2 + 0)). So ROO. 
As for the second case, pick z;, z2 such that 0z,x;, Rz1Z2, and y2 = (z2+ Sw). This 
lets us pick an x2 such that Qz)x2 and z2 = (x2 + z1). Note that Ox; (x2 + w) since 
(oz1xX1 A (Qz1x2 A (x2 + w) = (x2 + w))). Furthermore, 
y₂ = ((x₂ + z₁) + Sw) = ((x₂ + w) + Sz₁) = ((x₂ + w) + x₁).
So Rx₁y₂ since (Qx₁(x₂ + w) ∧ y₂ = ((x₂ + w) + x₁)).


6.29 Suppose R is a times-w relation. Suppose ROy2 and ROy3. Then, by Exercise 
6.9, y2 = 0 = y3. As an inductive hypothesis, suppose 


Vy2Vy3((Ryiy2 A Ryiy3) > y2 = ys). 
Suppose RSy;y2 and RSy;y3. We can pick x), x2 such that ox; Sy,, Rx,x2, and 
y2 = (x2 + w). By Theorem 6.2, x1 = y; (since ay; Sy;) and, hence, Ry;x2. We 
can also pick x3, x4 such that 0x3Sy,, Rx3x4, and y3 = (x4 + w). By Theorem 
6.2, x3 = y, and, hence, Ry;x4. By our inductive hypothesis, x2 = x4 and, hence, 
y2 = (42 + w) = (%4 + W) = 93. 


6.31 Note the following.

∀F #F = #F.
∀F, G(#F = #G → #G = #F).
∀F, G, H((#F = #G ∧ #G = #H) → #F = #H).

6.33 By Definition 6.11, the following are equivalent.

∃F, G(w = #F ∧ z = #G ∧ S(F, G))
∃F, G∃x ∉ F(w = #F ∧ z = #G ∧ (F ∪ {x}) ≈ G)
∃F∃x ∉ F(w = #F ∧ z = #(F ∪ {x})).

6.35 Suppose x ∈ ω and σxy. Suppose F is σ-closed and 0 ∈ F. Then x ∈ F
because x is a natural number. So y ∈ F because F is σ-closed. We conclude that
y is a member of a σ-closed set whenever 0 is. That is, y is a natural number. So y ∈ ω.

6.37 Suppose #F ∈ ω. If x ∈ F, we are done because then F = (F ∪ {x}). Suppose
x ∉ F. Then S(F, (F ∪ {x})) and, hence, by Exercise 6.33, σ#F#(F ∪ {x}). So, by
Exercise 6.35, #(F ∪ {x}) ∈ ω.

6.39 If F ⊂ G, then F ≈ F ⊂ G.


6.41 Suppose F ≠ ∅. Then ∅ ⊂ F. So, by Exercise 6.39, ∅ ≺ F.


6.43 By Exercise 6.31, if F ≈ G ≈ H₁ ⊂ H, then F ≈ H₁ ⊂ H.

6.45 Suppose F ≺ G ≺ H. Pick F₁, G₁ such that F ≈ F₁ ⊂ G and G ≈ G₁ ⊂ H.
By Exercise 6.39 and Axiom 6.2, F₁ ≺ G₁. Pick F₂ such that F₁ ≈ F₂ ⊂ G₁. Then
F ≈ F₁ ≈ F₂ ⊂ H. So F ≺ H.


6.47 If F ≈ G ≺ G ≈ F, then, by Exercise 6.43 and Axiom 6.2, F ≺ F.


6.49 If F is finite, Axiom 6.2 says: if F is smaller than G and G is the same size as 
H, then F is smaller than H. If F is infinite, Axiom 6.2 says: if F and G are both 
infinite and G is the same size as H, then F and H are both infinite. 


6.51 If F is finite, Axiom 6.3 says: if F is smaller than G ∪ {x}, then F is either
smaller than or the same size as G. If F is infinite, Axiom 6.3 says: if F and GU {x} 
are both infinite, then F and G are both infinite. 


6.53 If 𝔉(F) and F ≺ G ≺ F, then, by Exercise 6.45, F ≺ F, contrary to Theorem
6.19.


6.55 Suppose S(F, G) and S(H, G). By Exercise 6.46, F ≺ G and H ≺ G. By
Exercise 6.54, H ≼ F and F ≼ H. Now apply Exercise 6.53.


6.57 Suppose (F ∪ {x}) ≈ (G ∪ {y}). By Exercise 6.34, S(F, (G ∪ {y})) since
S(F, (F ∪ {x})). By Exercise 6.55, F ≈ G since S(G, (G ∪ {y})).


6.59 By Exercise 6.39, (F \ {x}) ≺ F since x ∈ F. So, by Theorem 6.20, 𝔉(F \ {x})
since 𝔉(F). Note that ((F \ {x}) ∪ {x}) = F ≈ G = ((G \ {y}) ∪ {y}). So we need
only apply Exercise 6.57.


6.61 By Exercise 6.37, 𝔉(F ∪ {y}). If x = y, we are done. Suppose x ≠ y. Then
x ∉ (F ∪ {y}). Furthermore, y ∈ (F ∪ {y}). So, by Exercise 6.60, (F ∪ {y}) ≈
(((F ∪ {y}) ∪ {x}) \ {y}). Since y ∉ F, (((F ∪ {y}) ∪ {x}) \ {y}) = (F ∪ {x}).


6.63 Note that S(∅, {0}) since 0 ∉ ∅ and (∅ ∪ {0}) ≈ {0}. By Exercise 6.33, σ0#{0}.
Suppose σ0z. Pick F, G such that 0 = #F, z = #G, and S(F, G). By Exercise
6.44, F = ∅ since F ≈ ∅. By Exercises 6.36 and 6.62, G ≈ {0} since S(∅, G). So
z = #G = #{0}.


6.65 In the tiny interpretation, (F ∪ {x}) ≼ G because every set is equivalent to every
set. Now consider the intended interpretation. If F is finite, Axiom 6.4 says: if F is
smaller than G, then F ∪ {x} is either smaller than or the same size as G. If F is
infinite, Axiom 6.4 says: if F and G are both infinite, then F ∪ {x} and G are both infinite.


6.67 Suppose x ∉ F and (F ∪ {x}) ≈ G. By Exercises 6.37 and 6.38, 𝔉(G). If
every object belongs to H, then F ⊂ H and, hence, by Exercise 6.39 and Axiom
6.2, F ≺ F, contrary to Theorem 6.19. Suppose y ∉ H. Suppose G ≺ (H ∪ {y}).
By Axiom 6.3 and Exercise 6.39, G ≼ H ≈ F ≺ (F ∪ {x}) ≈ G. So, by Axiom 6.2
and Exercises 6.43 and 6.45, G ≺ G, contrary to Theorem 6.19. On the other hand,
suppose (H ∪ {y}) ≺ G. By Axioms 6.2 and 6.3, (H ∪ {y}) ≼ F. By Exercise 6.39,
H ≺ (H ∪ {y}). So by Axiom 6.2 and Exercises 6.43 and 6.45, F ≺ F, contrary to
Theorem 6.19. Exercise 6.66 offers just one alternative: (H ∪ {y}) ≈ G. So S(H, G).


6.69 We are going to apply Theorem 6.18 using the following formula φ(H):

𝔉(H) ∧ ∀F, G((𝔉(F) ∧ G ≈ H) → 𝔉(F ∪ G)).

Exercises 6.36 and 6.44 make it easy to confirm that φ(∅). Suppose, as a kind of
inductive hypothesis, that φ(H). If x ∈ H, then φ(H ∪ {x}) because (H ∪ {x}) = H.
Suppose x ∉ H. Then ((H ∪ {x}) \ {x}) = H. Suppose 𝔉(F) and G ≈ (H ∪ {x}).
By Exercises 6.37 and 6.38, 𝔉(G). By Exercise 6.44, G ≠ ∅. Suppose y ∈ G. Then,
by Exercise 6.59, (G \ {y}) ≈ H. So, by our inductive hypothesis, 𝔉(F ∪ (G \ {y}))
and, hence, by Exercise 6.37, 𝔉(F ∪ G). That is, φ(H ∪ {x}). Now suppose φ(H₁) and
H₁ ≈ H₂. By Exercise 6.38, 𝔉(H₂). Suppose 𝔉(F) and G ≈ H₂. Then G ≈ H₁ and,
hence, 𝔉(F ∪ G). So φ(H₂). Theorem 6.18 lets us conclude: ∀H(𝔉(H) → φ(H)).
So ∀H(𝔉(H) → ∀F((𝔉(F) ∧ H ≈ H) → 𝔉(F ∪ H))). Now apply Exercise 6.31.


6.71 We use induction on x to show: if x ∈ ω, then G₁ satisfies the theorem whenever
#G₁ = x. First, suppose #G₁ = 0. Then, by Definition 6.11 and Exercise 6.44,
G₁ = ∅. If G₁ ≈ G₂, then, by Exercises 6.31 and 6.44, G₂ = ∅ and, hence, F₁ ≈ F₂
if and only if (F₁ ∪ G₁) ≈ (F₂ ∪ G₂). Now, as an inductive hypothesis, suppose x
has the desired property. Suppose σxy. Use Exercise 6.33 to pick G₁, H₁ such that
x = #H₁, y = #G₁, and S(H₁, G₁). Definition 6.12 lets us suppose w ∉ H₁ and
(H₁ ∪ {w}) ≈ G₁. By Exercise 6.44, G₁ ≠ ∅. Suppose z₁ ∈ G₁. Then, by Exercise
6.59, (G₁ \ {z₁}) ≈ H₁. By Definition 6.11, #(G₁ \ {z₁}) = x and, hence, by our
inductive hypothesis, (G₁ \ {z₁}) satisfies the theorem. That is, given any F₁, F₂, H₂,
if (F₁ ∩ (G₁ \ {z₁})) = (F₂ ∩ H₂) = ∅ and (G₁ \ {z₁}) ≈ H₂, then

F₁ ≈ F₂ ↔ (F₁ ∪ (G₁ \ {z₁})) ≈ (F₂ ∪ H₂).

Suppose (F₁ ∩ G₁) = (F₂ ∩ G₂) = ∅ and G₁ ≈ G₂. Since G₁ ≠ ∅, Exercise 6.44
guarantees that G₂ ≠ ∅. Suppose z₂ ∈ G₂. Then, by Exercise 6.59, (G₁ \ {z₁}) ≈
(G₂ \ {z₂}). So

F₁ ≈ F₂ ↔ (F₁ ∪ (G₁ \ {z₁})) ≈ (F₂ ∪ (G₂ \ {z₂})).

But, by Theorem 6.20 and Exercises 6.39, 6.57, 6.69, and 6.70, (F₁ ∪ (G₁ \ {z₁})) ≈
(F₂ ∪ (G₂ \ {z₂})) if and only if

((F₁ ∪ (G₁ \ {z₁})) ∪ {z₁}) ≈ ((F₂ ∪ (G₂ \ {z₂})) ∪ {z₂}).

So F₁ ≈ F₂ if and only if (F₁ ∪ G₁) ≈ (F₂ ∪ G₂). That is, G₁ satisfies the theorem.
6.73 We begin with some assumptions.

(1) x = #F              (5) z₁ = #H₁
(2) y₁ = #G             (6) z₂ = #H₂
(3) z₁ = #(F ∪ G)       (7) w₁ ∉ H₁
(4) (F ∩ G) = ∅         (8) (H₁ ∪ {w₁}) ≈ H₂

Note that, by (1), (2), and Exercise 6.69, 𝔉(F ∪ G) since x, y₁ ∈ ω. By (3), (5),
and Definition 6.11, (F ∪ G) ≈ H₁. If w₂ ∈ (F ∪ G) for every w₂, then, by (7),
H₁ ⊂ (F ∪ G) and, hence, by Exercises 6.39 and 6.43, (F ∪ G) ≺ (F ∪ G),
contrary to Theorem 6.19. Suppose w₂ ∉ (F ∪ G). Then, by (2) and Exercise 6.33,
σy₁#(G ∪ {w₂}). Furthermore, by (7), (8), and Exercise 6.70,

(F ∪ (G ∪ {w₂})) = ((F ∪ G) ∪ {w₂}) ≈ (H₁ ∪ {w₁}) ≈ H₂.

So, by (6) and Definition 6.11, z₂ = #(F ∪ (G ∪ {w₂})). By (4), (F ∩ (G ∪ {w₂})) = ∅.
So, by (1), α(x, #(G ∪ {w₂}), z₂).


6.75 We use induction. Recall that 0 = #∅ and then apply Definition 6.3.


6.77 Induction on y. Exercise 6.76 handles the case when y = 0. Suppose α(x, y₁, z₁)
and σy₁y₂. Theorem 6.21 lets us suppose σz₁z₂. Now apply Exercise 6.74.
Chapter 7
Intuitionist Logic


7.1 Inference 


We are going to use a fable to show three things. First, speakers of a language free of 
the usual logical vocabulary (in particular, without the resources for making “if... 
then” statements) can, nonetheless, have a well-developed conception of correct 
inference. Second, speakers of such a language can adopt a locution allowing them 
to assert that one sentence is inferable from another. Third, the result of this innova- 
tion will be a fragment of something called “intuitionist logic.” The upshot is that 
intuitionist logic can, at least in part, be construed as a medium for discussing and 
reasoning about reasoning. This idea of a logic emerging from a pre-existing practice 
of inference is not new to us: we already saw this happen in Chap. 1. There, however, 
it was classical logic that emerged from PRA—and, as we shall see, classical logic 
is incompatible with the intuitionist logic we are trying to develop in this chapter. 
We will have to be careful not to follow the pattern of Chap. 1 too closely.

Suppose we are observing some people, the Peregrins, who reliably distinguish 
between good and bad inferences (accepting only the former), but lack the vocabulary 
to say that a conclusion follows from some premises.¹ Here is a possible dialogue.

VILLAGE ELDER: The repeating decimal 0.123123123... is rational.
INQUISITIVE YOUTH: How do you know that?
VILLAGE ELDER: 0.123123123... = 123/(10³ − 1).
INQUISITIVE YOUTH: OK, but how do you know that?

The youth accepts the inference from premise to conclusion. (It is apparent that 123
divided by 10³ − 1 is rational.) Nonetheless, the youth would like a reason to believe


1 The name ‘Peregrin’ is a tribute to JAROSLAV PEREGRIN whose paper [7] outlines the approach
to intuitionist logic we pursue in this chapter. 


S. Pollard, A Mathematical Prelude to the Philosophy of Mathematics, 161 
the premise itself. (It is not so apparent that the repeating decimal is exactly that
rational number.) The elder is only too happy to oblige.

VILLAGE ELDER:
(10³ − 1) × 0.123123123... = 123.123123... − 0.123123123... = 123.
INQUISITIVE YOUTH: Right.

The youth accepts the inference from the new premise to the prior premise. (Just
divide through by 10³ − 1 in the new premise to obtain the prior one.) The youth
also accepts the new premise itself. (To verify the new premise, you only need to
perform the indicated operations.)
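The elder's arithmetic can be double-checked with exact rational arithmetic. The Python sketch below is only an illustration: the helper names `repeating_block_value` and `leading_digits` are hypothetical. It builds the fraction 123/(10³ − 1) and recovers its leading decimal digits by long division.

```python
from fractions import Fraction

def repeating_block_value(block, length):
    """Value of the decimal 0.(block)(block)... whose block has `length` digits."""
    return Fraction(block, 10**length - 1)

def leading_digits(q, count):
    """First `count` decimal digits of a fraction q with 0 <= q < 1, by long division."""
    digits = []
    remainder = q.numerator
    for _ in range(count):
        remainder *= 10
        digits.append(remainder // q.denominator)
        remainder %= q.denominator
    return "".join(str(d) for d in digits)

value = repeating_block_value(123, 3)
# Multiplying back by 10**3 - 1 recovers 123, as in the elder's new premise.
print(value, (10**3 - 1) * value, leading_digits(value, 9))
```

`Fraction` reduces 123/999 to lowest terms, so the printed fraction may look different from the elder's expression while denoting the same rational number.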

As I already mentioned, it is a distinctive feature of Peregrinese that the inquisitive
youth can endorse the inference from the new premise to the prior one, but cannot
assert that the one follows from the other. We can do so easily enough

0.123123123... = 123/(10³ − 1) if (10³ − 1) × 0.123123123... = 123

but there is no Peregrinese translation of our ‘if’. We can also assert that the elder’s
initial statement follows from the earlier premise:

0.123123123... is rational if it is equal to 123/(10³ − 1).

Again, the youth cannot say this.

If the inference from some Peregrinese sentences φ₁, ..., φₙ to the Peregrinese
sentence ψ is judged correct by the Peregrins, we record this by writing

{φ₁, ..., φₙ} ⊢ ψ.

Statements of this form are known as SEQUENTS. In some cases, the Peregrins will
assent to a sentence ψ without benefit of inferential support. We then record the
sequent

∅ ⊢ ψ.


By treating inference as a relation between a set of premises and a conclusion, we
capture two important features of Peregrinese behavior. First, in assessing inferences,
the Peregrins are indifferent to the order in which premises are given. If they accept
the inference from premises φ₁, φ₂, presented in that order, to conclusion ψ, they
will accept the inference from premises φ₂, φ₁, presented in that order, to conclusion
ψ. This is reflected in the set theoretic fact that {φ₁, φ₂} = {φ₂, φ₁}. Second, in
assessing inferences, the Peregrins are indifferent to repetitions of premises. They
will accept the inference from premises φ₁, φ₁, φ₂ to conclusion ψ if and only if they
will accept the inference from φ₁, φ₂ to ψ. This is reflected in the set theoretic fact
that {φ₁, φ₁, φ₂} = {φ₁, φ₂}.


Three other features of Peregrinese behavior are worth mentioning. We will
express them as axioms governing the inferability relation ⊢. First, they will accept
any inference from a sentence to itself.

Axiom 7.1 (TAUT)
{φ} ⊢ φ.

‘Taut’ is short for ‘tautology’. Second, if they accept an inference, they will continue
to accept it when premises are added.

Axiom 7.2 (THIN)
Φ ⊢ ψ ⟹ Φ ∪ {φ} ⊢ ψ.

We use capital Greek letters (such as ‘Φ’) to stand for sets of sentences. ‘Thin’ is
short for ‘thinning’. Third, if they accept the inference from φ₁, ..., φₙ to ψ, they
are willing to replace ψ with φ₁, ..., φₙ in any correct inference where ψ serves as
a premise.

Axiom 7.3 (CUT)
Ψ ⊢ ψ, Φ ∪ {ψ} ⊢ χ ⟹ Φ ∪ Ψ ⊢ χ.

A particularly simple form of the Cut principle is:

{φ} ⊢ ψ, {ψ} ⊢ χ ⟹ {φ} ⊢ χ.

For example, having accepted the inferences

(10³ − 1) × 0.123123123... = 123 ⊢ 0.123123123... = 123/(10³ − 1)

0.123123123... = 123/(10³ − 1) ⊢ 0.123123123... is rational

the inquisitive youth will accept the inference

(10³ − 1) × 0.123123123... = 123 ⊢ 0.123123123... is rational.

The Peregrins treat the inferability relation as transitive.


7.2 Conjunctions 


The Peregrins are careful to distinguish between sentences and clauses within
sentences, giving a characteristic wink to indicate the end of a sentence. It is always
clear when

Sentence. Sentence.

is intended rather than

Clause; clause.

We indicate the former structure by writing

φ₁, φ₂

and the latter by writing

(φ₁ & φ₂).

When they assess inferences, the Peregrins treat

Premise. Premise.

as equivalent to

Premise; premise.

We capture this practice in the following axiom.

Axiom 7.4 (AND)
{φ₁, φ₂} ⊢ ψ ⟺ {(φ₁ & φ₂)} ⊢ ψ.

The Peregrins are willing to accept the inference from φ₁, φ₂ to ψ if and only if they
are willing to accept the inference from (φ₁ & φ₂) to ψ.


Theorem 7.1 {(φ₁ & φ₂)} ⊢ φ₁.
Proof Here is a formal derivation.

1. {φ₁} ⊢ φ₁              Taut
2. {φ₁, φ₂} ⊢ φ₁          1 Thin
3. {(φ₁ & φ₂)} ⊢ φ₁       2 And


This will be our standard format for verifying sequents: a series of numbered lines, 
each featuring a sequent and a justification. The justification ‘1 Thin’ on line 2 means 
that we obtained the sequent on line 2 by applying the principle Thin to the sequent 
on line 1. 
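The bookkeeping in such a derivation is simple enough to mimic in code. The following Python sketch is an illustration only: the names `Sequent`, `taut`, `thin`, `cut`, and `and_left` are hypothetical, sentences are represented as plain strings, and `and_left` covers just the left-to-right direction of the And axiom. Storing the premises in a `frozenset` builds in the indifference to order and repetition noted earlier.

```python
from typing import FrozenSet, NamedTuple

class Sequent(NamedTuple):
    premises: FrozenSet[str]
    conclusion: str

def taut(phi: str) -> Sequent:
    # Taut: any sentence is inferable from itself.
    return Sequent(frozenset({phi}), phi)

def thin(s: Sequent, phi: str) -> Sequent:
    # Thin: adding a premise preserves correctness.
    return Sequent(s.premises | {phi}, s.conclusion)

def cut(left: Sequent, right: Sequent) -> Sequent:
    # Cut: replace a premise of `right` with the premises that yield it in `left`.
    psi = left.conclusion
    if psi not in right.premises:
        raise ValueError("cut formula is not a premise of the second sequent")
    return Sequent((right.premises - {psi}) | left.premises, right.conclusion)

def and_left(s: Sequent, phi1: str, phi2: str) -> Sequent:
    # And (one direction): two separate premises become one conjunction.
    if s.premises != frozenset({phi1, phi2}):
        raise ValueError("And applies to a sequent with exactly these two premises")
    return Sequent(frozenset({f"({phi1} & {phi2})"}), s.conclusion)

# The three-line derivation of Theorem 7.1, with P and Q as sample sentences.
line1 = taut("P")
line2 = thin(line1, "Q")
line3 = and_left(line2, "P", "Q")
print(line3)
```

Each function returns a new sequent rather than modifying its input, so earlier lines of a derivation stay available for later citation, just as in the numbered format.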


Theorem 7.2 {(φ₁ & φ₂)} ⊢ φ₂.

Exercise 7.1 {φ₁, φ₂} ⊢ (φ₁ & φ₂).
Theorem 7.3 Φ ∪ {φ₁, φ₂} ⊢ ψ ⟹ Φ ∪ {(φ₁ & φ₂)} ⊢ ψ.

Proof In this case, we are not verifying a sequent: we are providing a recipe for
passing from one sequent to another in the course of such a verification. When we
write down the starting sequent, our justification is ‘Prem’ (for ‘premise’).

1. Φ ∪ {φ₁, φ₂} ⊢ ψ             Prem
2. {(φ₁ & φ₂)} ⊢ φ₁             Thm 7.1
3. Φ ∪ {(φ₁ & φ₂), φ₂} ⊢ ψ      1,2 Cut
4. {(φ₁ & φ₂)} ⊢ φ₂             Thm 7.2
5. Φ ∪ {(φ₁ & φ₂)} ⊢ ψ          3,4 Cut


We allow ourselves to write down sequents already verified, citing the relevant the- 
orem (‘Thm’) or exercise (‘Ex’). Here is a more readable version of the derivation 
without set theoretic notation and Greek letters. 


1. P, Q, R ⊢ S             Prem
2. (Q & R) ⊢ Q             Thm 7.1
3. P, (Q & R), R ⊢ S       1,2 Cut
4. (Q & R) ⊢ R             Thm 7.2
5. P, (Q & R) ⊢ S          3,4 Cut


We will use this streamlined format whenever it seems helpful. 
We have shown 
P, Q, R ⊢ S ⟹ P, (Q & R) ⊢ S


without using any special information about ‘P’, ‘Q’, ‘R’, and ‘S’. The proof would
have worked just as well with other formulas. So any time you have a sequent of
form

P, Q, R ⊢ S


with ‘P’, ‘Q’, ‘R’, and ‘S’ as shown or replaced by other formulas (or even with
the idle letter ‘P’ replaced by several formulas), you can derive the corresponding
sequent

P, (Q & R) ⊢ S


citing ‘Thm 7.3’. For example, from
S₁, ..., Sₙ, (Q & P), R ⊢ P


you can derive
S₁, ..., Sₙ, ((Q & P) & R) ⊢ P


on the authority of Theorem 7.3. 
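The recipe behind Theorem 7.3 can be sketched in Lean 4. The encoding is an assumption made for illustration only: a sequent with premises P, Q, R and conclusion S is represented as a function taking proofs of the premises to a proof of the conclusion, and ‘&’ is Lean's ∧.

```lean
-- Assumption: the sequent "P, Q, R ⊢ S" is encoded as the function type P → Q → R → S.

-- Theorem 7.3, streamlined version: merge the premises Q and R into (Q & R).
example (P Q R S : Prop) (h : P → Q → R → S) : P → Q ∧ R → S :=
  fun p qr => h p qr.left qr.right
```

The two projections play the roles of Theorems 7.1 and 7.2, and feeding them to `h` corresponds to the two applications of Cut.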


Exercise 7.2 P, (Q & R) ⊢ S ⟹ P, Q, R ⊢ S.
Exercise 7.3 S₁, P ⊢ Q and S₂, P ⊢ R ⟹ S₁, S₂, P ⊢ (Q & R).
Exercise 7.4 (P & Q) ⊢ (Q & P).
Exercise 7.5 (P & (Q & R)) ⊢ ((P & Q) & R).


7.3 Conditionals 


We now start to wonder what sort of expansion of Peregrinese would allow the
Peregrins to assert that one sentence is inferable from others (that is, assert it by
using a sentence that says it, not just by assenting to the inference when it is made).
One device would be a new Peregrinese symbol ‘▷’ that expresses the relation of
inferability. Affirming (φ ▷ ψ) would then be equivalent to endorsing the inference
from φ to ψ. So ‘▷’ should have the following properties.

First, Peregrins should accept (φ ▷ ψ) whenever they accept the inference from
φ to ψ.

Axiom 7.5 (DED)²
{φ} ⊢ ψ ⟹ ∅ ⊢ (φ ▷ ψ).


Second, from φ and (φ ▷ ψ), Peregrins should be prepared to infer ψ.

Axiom 7.6 (MP)³
{φ, (φ ▷ ψ)} ⊢ ψ.


Third, if Peregrins accept the inference from φ₀ to ψ₀ whenever they accept the
inference from φ₁ to ψ₁, then they should accept the inference from (φ₁ ▷ ψ₁) to
(φ₀ ▷ ψ₀). More generally, given

{φ₁} ⊢ ψ₁, ..., {φₙ} ⊢ ψₙ ⟹ {φ₀} ⊢ ψ₀

we should have

{(φ₁ ▷ ψ₁), ..., (φₙ ▷ ψₙ)} ⊢ (φ₀ ▷ ψ₀).


² The name ‘Ded’ recalls a result, known as the deduction theorem, of which our axiom is a special
case. We still need to confirm the full deduction theorem: that is, we still need to confirm that
(φ ▷ ψ) is inferable when the derivation of ψ from φ depends on additional premises Φ:

Φ ∪ {φ} ⊢ ψ ⟹ Φ ⊢ (φ ▷ ψ).

Axioms 7.7, 7.8, and 7.9 will help us do so. This will show that the Peregrins should accept the
inference from Φ to (φ ▷ ψ) whenever they accept the inference from Φ ∪ {φ} to ψ. Perhaps
we should have just adopted that as an axiom. I saw two reasons not to take that course. First, I
thought Axiom 7.9 deserved separate discussion because of its special role in distinguishing between
inferability and strict implication (to be discussed below). Second, I thought it quite evident that
Axioms 7.7 and 7.8 belong in a logic of inferability, but less immediately evident that the deduction
theorem does. Others may see the matter differently.

³ ‘MP’ stands for ‘modus ponens’.


This means that Cut and Exercise 7.3 yield the following axioms.


Axiom 7.7 (CUT)
{(φ ▷ ψ), (ψ ▷ χ)} ⊢ (φ ▷ χ).


Axiom 7.8 (AND)
{(χ ▷ φ), (χ ▷ ψ)} ⊢ (χ ▷ (φ & ψ)).


As I already mentioned, our target is something known as INTUITIONIST LOGIC.
At the moment, we are trying to introduce the rules for the intuitionist conditional.
We are still a bit short of our goal. So far, our connective ▷ is indistinguishable from
connectives you may have encountered in a logic class: the strict implications ⥽ of
various modal logics. We need to add an axiom to make it clear that ▷ is not ⥽
(because we know that the intuitionist conditional is not ⥽).


Axiom 7.9 (INT)
{ψ} ⊢ (φ ▷ ψ).


You might think that Int is unnecessary since Ded and Thin yield:
∅ ⊢ ψ ⟹ ∅ ⊢ (φ ▷ ψ).


If Peregrins accept ψ, they accept (φ ▷ ψ). It does not follow, however, that they
should accept the inference from ψ to (φ ▷ ψ). Students of modal logics such as
T, S4, and S5 can confirm that it does not follow. If ψ is logically true, then so is
(φ ⥽ ψ). However, (φ ⥽ ψ) will follow from ψ only in special cases. (The truth of
ψ does not guarantee that ψ will be true in every possible world where φ is true.)
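The axioms governing the new conditional can be checked against a familiar constructive system. The following Lean 4 sketch assumes, purely for illustration, that the Peregrinese conditional is modeled by Lean's function arrow → and that premises are hypotheses. Under that assumption each of MP, the conditional forms of Cut and And, and Int is a one-line term.

```lean
-- Assumption: the Peregrinese conditional is modeled by Lean's →.

-- Axiom 7.6 (MP)
example (P Q : Prop) (p : P) (h : P → Q) : Q := h p

-- Axiom 7.7 (Cut for conditionals)
example (P Q R : Prop) (h₁ : P → Q) (h₂ : Q → R) : P → R :=
  fun p => h₂ (h₁ p)

-- Axiom 7.8 (And for conditionals)
example (P Q R : Prop) (h₁ : R → P) (h₂ : R → Q) : R → P ∧ Q :=
  fun r => ⟨h₁ r, h₂ r⟩

-- Axiom 7.9 (Int): from a sentence, infer any conditional with that consequent.
example (P Q : Prop) (q : Q) : P → Q := fun _ => q
```

The last example is the one that separates this conditional from strict implication: the proof simply ignores the antecedent.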

Is Int faithful to the Peregrinese practice of inference? Ded does not guarantee 
this. Perhaps we could acquire behavioral evidence that Int captures some aspect of 
Peregrinese reasoning. Here is a dialogue that might be relevant. 


VILLAGE ELDER: φ.

INQUISITIVE YOUTH: Well, I’m willing to suppose so, just for the sake of argument.
So what?

VILLAGE ELDER: ψ!

INQUISITIVE YOUTH: Huh?


A plausible interpretation of this discourse is that the elder has inferred ψ from φ
and that the youth is questioning that inference. The elder now offers support for that
inference. (Suppose “Huh?” is a standard way of soliciting support for an inference.
We would then expect the elder to respond by offering such support.)

VILLAGE ELDER: ψ.

INQUISITIVE YOUTH: OK, though I’d be interested to know why I should believe
ψ.

The youth accepts something. It is not ψ. The youth could be acknowledging that
ψ provides a good reason to accept the inference from φ to ψ. If so, Int would be
justified by the way Peregrins assess inferences. We will take it for granted that Int
is justified in this way.

Here is a result we will need for the proof of Theorem 7.4. 


Exercise 7.6 ⊢ (Q ▷ Q).


As part of our streamlining process, we leave a blank to the left of ‘⊢’ when there
should officially be ‘∅’. An official version of the sequent in the above exercise would
be ‘∅ ⊢ (ψ ▷ ψ)’.

Theorem 7.4 P ⊢ (Q ▷ (P & Q)).


Proof Here is a formal derivation.

1. (Q ▷ P), (Q ▷ Q) ⊢ (Q ▷ (P & Q))     And
2. ⊢ (Q ▷ Q)                             Ex 7.6
3. (Q ▷ P) ⊢ (Q ▷ (P & Q))               1,2 Cut
4. P ⊢ (Q ▷ P)                           Int
5. P ⊢ (Q ▷ (P & Q))                     3,4 Cut


Theorem 7.5 P, Q ⊢ R ⟹ P ⊢ (Q ▷ R).


Proof The formal derivation below provides a recipe for getting from one sequent
to the other.

1. P, Q ⊢ R                                        Prem
2. (P & Q) ⊢ R                                     1 And
3. ⊢ ((P & Q) ▷ R)                                 2 Ded
4. (Q ▷ (P & Q)), ((P & Q) ▷ R) ⊢ (Q ▷ R)          Cut
5. (Q ▷ (P & Q)) ⊢ (Q ▷ R)                         3,4 Cut
6. P ⊢ (Q ▷ (P & Q))                               Thm 7.4
7. P ⊢ (Q ▷ R)                                     5,6 Cut
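As a sanity check on the two preceding results, here is a hedged Lean 4 sketch. It assumes that the Peregrinese conditional is modeled by → and ‘&’ by ∧, and it represents the starting sequent of the second result by a hypothesis `h` (a name chosen only for this illustration) that already has the conjoined form reached on line 2 of the derivation.

```lean
-- Assumption: conditional as →, '&' as ∧, premises as hypotheses.

-- Theorem 7.4: from P, infer that Q yields (P & Q).
example (P Q : Prop) (p : P) : Q → P ∧ Q := fun q => ⟨p, q⟩

-- Theorem 7.5: if R is inferable from (P & Q), then from P alone
-- we may infer the conditional with antecedent Q and consequent R.
example (P Q R : Prop) (h : P ∧ Q → R) (p : P) : Q → R :=
  fun q => h ⟨p, q⟩
```

The second term pairs the standing premise with the newly assumed one and hands the pair to `h`, which is the same detour through the conjunction that the formal derivation takes.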


Theorem 7.6 P₁, ..., Pₙ, Q ⊢ R ⟹ P₁, ..., Pₙ ⊢ (Q ▷ R).


Proof The proof is by induction on the number of premises P₁, ..., Pₙ. Theorem
7.5 handles the case when n = 1. Suppose

P₁, ..., Pₙ₋₁, Pₙ, Pₙ₊₁, Q ⊢ R.

Then, by Theorem 7.3,

P₁, ..., Pₙ₋₁, (Pₙ & Pₙ₊₁), Q ⊢ R.

Notice that there are now only n premises other than Q. So, by inductive hypothesis,

P₁, ..., Pₙ₋₁, (Pₙ & Pₙ₊₁) ⊢ (Q ▷ R).

Now just apply Exercise 7.2. (Note that in Exercise 7.2 the one sentence P could
have been a set of sentences Φ.)


Exercise 7.7 P₁, ..., Pₙ ⊢ (Q ▷ R) ⟹ P₁, ..., Pₙ, Q ⊢ R.


Theorem 7.7 P₁, ..., Pₙ ⊢ (Q ▷ R) ⟺ P₁, ..., Pₙ, Q ⊢ R.


Exercise 7.8 (P ▷ (Q ▷ R)) ⊢ ((P ▷ Q) ▷ (P ▷ R)).


7.4 Negations 


No one knows when the Peregrins first realized that some statements of arithmetic 
allowed them to infer every statement of arithmetic. 


VILLAGE ELDER: 0 = 1. 

INQUISITIVE YOUTH: [incredulously] What? 

VILLAGE ELDER: [patiently] Just suppose 0 = 1. 

INQUISITIVE YOUTH: OK, I'll play along, just for the sake of argument. 


VILLAGE ELDER: 3 + 0 = 3 + 1.

INQUISITIVE YOUTH: Yep.

VILLAGE ELDER: 3 = 4.

INQUISITIVE YOUTH: Yep.

VILLAGE ELDER: 4 = 2 × 2.

INQUISITIVE YOUTH: Yep.

VILLAGE ELDER: 3 = 2 × 2.

INQUISITIVE YOUTH: Yep.

VILLAGE ELDER: 3 is even.

INQUISITIVE YOUTH: Yep.

VILLAGE ELDER: 3 + 0 = 4 + 1.

INQUISITIVE YOUTH: Yep.

VILLAGE ELDER: 3 = 5.

INQUISITIVE YOUTH: Yep.

VILLAGE ELDER: 5 is even.

INQUISITIVE YOUTH: Yeah, yeah, I think I get the point.
We have
0 = 1 ⊢ ψ


whenever ψ is a statement of Peregrinese arithmetic. So, by Cut,


Φ ⊢ 0 = 1 ⟹ Φ ⊢ ψ


whenever ψ is a statement of Peregrinese arithmetic. For Peregrins, inferring ‘0 = 1’
from an interlocutor’s premises is like saying, “If you believe that, you’ll believe
anything about the natural numbers.” At some point, they acquired the ability to say,
in effect, “If you believe that, you’ll believe anything.” They have a pronouncement


Balderdash!


or, more briefly,
⊥


from which they are prepared to infer any declarative Peregrinese sentence.


Axiom 7.10 (EFQ)⁴
⊥ ⊢ ψ.


Theorem 7.8 Φ ⊢ ⊥ ⟹ Φ ⊢ ψ.
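A brief Lean 4 sketch may help fix the idea. It assumes, for illustration only, that the pronouncement "Balderdash!" is modeled by Lean's empty proposition `False`, and that a premise set is represented by a single hypothesis `A`.

```lean
-- Assumption: the absurdity constant is modeled by Lean's False.

-- Axiom 7.10 (EFQ): from absurdity, infer anything.
example (P : Prop) (h : False) : P := False.elim h

-- Theorem 7.8: whatever yields absurdity yields everything.
example (A P : Prop) (h : A → False) : A → P :=
  fun a => False.elim (h a)
```

The second term chains the given inference to absurdity with EFQ, which is the application of Cut that the theorem requires.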


After reading our discussion of conditionals, the Peregrins quickly adopted the
neologism ‘▷’ and soon combined it with ‘⊥’ to expand their logical vocabulary
even more. They began to use

⊘φ


as shorthand for
(φ ▷ ⊥).


Definition 7.1 You can freely interchange ⊘φ and (φ ▷ ⊥).

⊘ is a kind of stop sign. A statement of the form
⊘φ


is a warning that, if you accept φ, you risk trivializing future inferences by allowing
yourself to infer everything. A statement of the form


(φ ▷ ⊘ψ)


is a warning that, if you accept φ, you had better not accept ψ.
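Lean 4 happens to define its negation in the same style as Definition 7.1, which makes for a compact illustration. The sketch below assumes that the stop sign is modeled by Lean's ¬, the conditional by →, and absurdity by `False`; none of this notation comes from the text.

```lean
-- Assumption: the stop sign is modeled by Lean's ¬, absurdity by False.

-- Definition 7.1: the negation and the conditional-to-absurdity are interchangeable.
example (P : Prop) : (¬P) = (P → False) := rfl

-- The second warning: given the conditional from P to the negation of Q,
-- accepting both P and Q leads to absurdity.
example (P Q : Prop) (h : P → ¬Q) (p : P) (q : Q) : False := h p q
```

The first example goes through by `rfl` because Lean's `Not` unfolds to exactly this conditional.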


⁴ ‘EFQ’ stands for ‘ex falso quodlibet’: “from a falsehood anything [follows].”




Theorem 7.9 ⊢ ⊘⊥.
Proof ‘⊘⊥’ is shorthand for ‘(⊥ ▷ ⊥)’. So we just apply Exercise 7.6.


Notice that I did not use EFQ in the above proof. I would like you, too, to avoid
EFQ in the following exercises. You should feel free, however, to use Definition 7.1
to replace sentences of the form ⊘φ with ones of the form (φ ▷ ⊥).


Exercise 7.9 (P ▷ Q) ⊢ (⊘Q ▷ ⊘P).

Exercise 7.10 R₁, ..., Rₙ, P ⊢ Q ⟹ R₁, ..., Rₙ, ⊘Q ⊢ ⊘P.

Exercise 7.11 (P ▷ ⊘Q) ⊢ (Q ▷ ⊘P).

Exercise 7.12 (P ▷ ⊘P) ⊢ ⊘P. (Remember Exercise 7.8.)

Exercise 7.13 ⊥ ⊢ ⊘P.

Exercise 7.14 P ⊢ ⊘⊘P.

Exercise 7.15 ⊘⊘⊘P ⊢ ⊘P.

Exercise 7.16 ⊘⊘⊥ ⊢ ⊥.

Exercise 7.17 P₁, ..., Pₙ ⊢ ⊘Q ⟺ P₁, ..., Pₙ, Q ⊢ ⊥.

Exercise 7.18 ⊘(P ▷ Q) ⊢ ⊘Q.

Exercise 7.19 ⊘⊘(P ▷ Q) ⊢ (P ▷ ⊘⊘Q). (Note that P, ⊘Q ⊢ ⊘(P ▷ Q).)


You have obtained three versions of CONTRAPOSITION (Exercises 7.9, 7.10, and
7.11) from the first of which you can easily obtain the principle of modus tollens:


(P ▷ Q), ⊘Q ⊢ ⊘P.


You have shown that a statement must be absurd if it yields its own absurdity
(Exercise 7.12). You have shown that the absurdity of everything follows from absurdity
(Exercise 7.13). You have confirmed the principle of DOUBLE NEGATION
INTRODUCTION (Exercise 7.14). You have obtained two versions of DOUBLE NEGATION
ELIMINATION (Exercises 7.15 and 7.16). And more (Exercises 7.17–7.19). Strangely
enough, you have done all this without using EFQ. But EFQ is our only information
about what the sentence ‘⊥’ might say. In the absence of EFQ, ‘⊥’ could assert a
logical truth rather than an absurdity. Indeed, ‘⊥’ could then say anything.

If we want ‘⊘P’ to express the rejection or negation of ‘P’, then it matters
tremendously what ‘⊥’ says. (Noting that a logical truth can be inferred from ‘P’ is
not an effective way of expressing disapproval of ‘P’.) It may be surprising, then, that
so many properties thought to be characteristic of negation follow from Definition
7.1 and facts about ‘▷’.⁵
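The point that these results do not depend on what the absurdity constant says can be made concrete. In the Lean 4 sketch below, which assumes the conditional is modeled by →, the first example proves modus tollens with an arbitrary proposition `B` (a placeholder introduced only for this illustration) standing where absurdity would stand. The second example is the special case in which `B` is Lean's `False`.

```lean
-- Assumption: the conditional is modeled by →; B is an arbitrary stand-in for the
-- absurdity constant, about which nothing is assumed.

-- Modus tollens with B in place of absurdity: no appeal to EFQ is possible here,
-- since B has no special properties.
example (B P Q : Prop) (h : P → Q) (hq : Q → B) : P → B :=
  fun p => hq (h p)

-- The familiar form, with Lean's ¬ and False.
example (P Q : Prop) (h : P → Q) (nq : ¬Q) : ¬P :=
  fun p => nq (h p)
```

Both proofs are the same term up to renaming, which is one way of seeing that only facts about the conditional are doing the work.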

From now on, feel free to use EFQ. 


5 This is just one of the many insights to be found in Martin [6]. I will be borrowing exercises from 
that book for the rest of this chapter. 
Exercise 7.20 P, ⊘P ⊢ Q.

Exercise 7.21 ⊘(P ▷ Q) ⊢ ⊘⊘P.

Exercise 7.22 (⊘Q ▷ ⊘P) ⊢ ⊘⊘(P ▷ Q).

Exercise 7.23 (P ▷ ⊘⊘Q) ⊢ ⊘⊘(P ▷ Q). (Recall Exercise 7.11.)

Exercise 7.24 ⊘(P & ⊘Q) ⊢ ⊘⊘(P ▷ Q). (Note Exercise 7.23.)


7.5 Absurd Absurdities 


Suppose it would be absurd for φ to be absurd: that is, suppose ⊘⊘φ. Can we safely
infer φ itself? Exercises 7.15 and 7.16 showed that this inference is sometimes
warranted. Is it always warranted? Well, under what circumstances would the absurdity
of φ trivialize the Peregrinese practice of inference by allowing the Peregrins to infer
everything? (This is what it would mean for φ’s absurdity to be absurd.) One such
circumstance is that φ is a Peregrinese THEOREM, something they accept without
argument. If


∅ ⊢ φ


then, by Exercise 7.14 and Cut,
∅ ⊢ ⊘⊘φ.


It would be absurd for a theorem to be absurd. Is that the only situation in which it
is absurd for something to be absurd? Could we have a reason to assert the absurdity
of a sentence φ’s absurdity without having a reason to assert φ?

It seems hard to imagine such a situation. It may be instructive to consider a case 
that turns out not to be what we are looking for. Suppose the Peregrins have a kind 
of generic statement 


Thingamajig is a whatchamacallit 


or, more briefly, 
■.


The generic statement is like a black box. Since its contents are inaccessible to 
the Peregrins, whatever they can prove about the black box, they can prove about 
anything. (That is the sense in which it is generic.) In particular, if they show that 
the black box is absurd, then they can show that anything is absurd. But that would 
be absurd. Isn’t that a good reason for the Peregrins to assert the absurdity of the 
absurdity of the black box even though they have no reason to assert the black box 
itself? 

We need to look into this more carefully. To start, we need to say exactly what it
means for the black box to be generic. Let θ(■) be a sentence in which ‘■’ occurs.
Let θ(ψ) be the result of replacing every occurrence of ‘■’ in θ(■) with an occurrence
of the sentence ψ. Then the black box obeys the following principle.


Proposition 7.1 (GEN)


∅ ⊢ θ(■) ⟹ ∅ ⊢ θ(ψ).


If the Peregrins can prove that the mysterious black box has a property, then they 
can prove that every sentence has that property. We can now show that the Peregrins 
can prove an absurdity if they can prove the black box. 


Proposition 7.2 ⊢ ■ ⟹ ⊢ ⊥.


Proof The proof only takes two lines.

1. ⊢ ■      Prem
2. ⊢ ⊥      1 Gen


The Peregrins can also prove an absurdity if they can prove the absurdity of the black
box.


Exercise 7.25 ⊢ ⊘■ ⟹ ⊢ ⊥.


It might be tempting to treat the preceding result as a proof that it would be absurd
for the black box to be absurd:
⊢ ⊘⊘■


or, equivalently,
⊘■ ⊢ ⊥.


The next result shows that this is correct only if the Peregrinese system of inference
allows them to prove absurdities.


Exercise 7.26 ⊢ ⊘⊘■ ⟹ ⊢ ⊥.


In general terms, the situation is this. If a certain sentence φ is a Peregrinese
theorem, then so is a certain sentence ψ:


∅ ⊢ φ ⟹ ∅ ⊢ ψ.


(Exercise 7.25 says: ⊢ ⊘■ ⟹ ⊢ ⊥.) Yet, if the Peregrinese system is consistent, the
Peregrins cannot verify the inference from φ to ψ; that is, they cannot verify


φ ⊢ ψ.


(They cannot verify ⊘■ ⊢ ⊥.) As we saw earlier in connection with Int, the inference
from φ to ψ is not the same as the inference from the theoremhood of φ to the
theoremhood of ψ. When Peregrins ask you to assume a premise for the purpose of
deriving a conclusion, they are not asking you to assume that the premise is a theorem
for the sake of showing that the conclusion is a theorem. Assuming a premise is not
the same as assuming that the premise is provable. That is, it is not intrinsic to the
very concept of inference that these assumptions are interchangeable.

Our hope was that a sentence whose status is, in principle, indeterminate would
be an example of a sentence whose absurdity is demonstrably absurd even though the
sentence itself is not provable. This did not work out. No need to despair, however.
The following theorem introduces a useful technique for deriving the absurdity of
a sentence’s absurdity without deriving the sentence itself. The general idea is that
if ψ is derivable from both φ and ⊘φ, then it would be absurd for ψ to be absurd
because the absurdity of ψ would (by contraposition) yield both ⊘φ and ⊘⊘φ.


Theorem 7.10 Φ ∪ {φ} ⊢ ψ, Ψ ∪ {⊘φ} ⊢ ψ ⟹ Φ ∪ Ψ ⊢ ⊘⊘ψ.


Proof See the formal derivation below.

1. Φ ∪ {φ} ⊢ ψ                 Prem
2. Ψ ∪ {⊘φ} ⊢ ψ                Prem
3. Φ ∪ {⊘ψ} ⊢ ⊘φ               1 Ex 7.10
4. Ψ ∪ {⊘ψ} ⊢ ⊘⊘φ              2 Ex 7.10
5. ⊘φ, ⊘⊘φ ⊢ ⊥                 Ex 7.20
6. Φ ∪ {⊘ψ, ⊘⊘φ} ⊢ ⊥           3,5 Cut
7. Φ ∪ Ψ ∪ {⊘ψ} ⊢ ⊥            4,6 Cut
8. Φ ∪ Ψ ⊢ ⊘⊘ψ                 7 Ex 7.17
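The technique of this theorem has a compact constructive rendering. The Lean 4 sketch below assumes that the conditional is modeled by →, the stop sign by ¬, and that the side premises are empty, so the two starting sequents become the hypotheses `h₁` and `h₂` (names chosen for this illustration).

```lean
-- Assumption: conditional as →, stop sign as ¬, no side premises.

-- Theorem 7.10: if Q follows from P and also from the absurdity of P,
-- then the absurdity of Q is itself absurd.
example (P Q : Prop) (h₁ : P → Q) (h₂ : ¬P → Q) : ¬¬Q :=
  fun nq => nq (h₂ (fun p => nq (h₁ p)))
```

Reading the term from the inside out reproduces the derivation: contraposing `h₁` against `nq` gives the absurdity of P, feeding that to `h₂` gives Q, and Q clashes with `nq`. No classical axiom is used.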


We will now use Theorem 7.10 to prove the absurdity of the absurdity of the
following instance of PEIRCE’S LAW.⁶


(((P ▷ Q) ▷ P) ▷ P)


This sentence is not a theorem of our formal system nor would anything we know
about the Peregrins lead us to think that it ought to be.⁷ Nonetheless, we now prove
its double negation.


⁶ CHARLES SANDERS PEIRCE (1839–1914) was an American logician, mathematician, philosopher,
and scientist.

⁷ If you can infer P from the inferability of Q from P, can you safely infer P? That move may
be justified; but its justifiability is not intrinsic to the very concept of inference. By the way, it is
not obvious that ‘(((P ▷ Q) ▷ P) ▷ P)’ is unprovable in our system. There are proofs of its
unprovability, but I will not supply any here. If you want to pursue this further, you might consult
the WIKIPEDIA article on Heyting algebras (http://en.wikipedia.org/wiki/Heyting_algebra).


Theorem 7.11 ⊢ ⊘⊘(((P ▷ Q) ▷ P) ▷ P).


Proof See the formal derivation below.

1. P ⊢ (((P ▷ Q) ▷ P) ▷ P)              Int
2. P, ⊘P ⊢ Q                            Ex 7.20
3. ⊘P ⊢ (P ▷ Q)                         2 Thm 7.5
4. (P ▷ Q), ((P ▷ Q) ▷ P) ⊢ P           MP
5. ⊘P, ((P ▷ Q) ▷ P) ⊢ P                3,4 Cut
6. ⊘P ⊢ (((P ▷ Q) ▷ P) ▷ P)             5 Thm 7.5
7. ⊢ ⊘⊘(((P ▷ Q) ▷ P) ▷ P)              1,6 Thm 7.10
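The double negation of this instance of Peirce's law can also be checked in Lean 4 without classical axioms. The sketch assumes the usual stand-ins (conditional as →, stop sign as ¬, absurdity as `False`), none of which is the book's own notation.

```lean
-- Assumption: conditional as →, stop sign as ¬, absurdity as False.

-- Theorem 7.11: the absurdity of this instance of Peirce's law is absurd.
example (P Q : Prop) : ¬¬(((P → Q) → P) → P) :=
  fun h =>
    h (fun f =>
      -- To use f we must supply a proof of P → Q.
      -- Given p : P, Peirce's law holds trivially (Int), contradicting h;
      -- from that absurdity we obtain Q by EFQ.
      f (fun p => False.elim (h (fun _ => p))))
```

The hypothesis `h` is used twice, once for each of the two cases that the derivation combines in its last line. The un-negated law admits no such term, which matches the report that it is unprovable in the formal system.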


We can learn an important lesson from Theorem 7.11. If each instance of the
double negation elimination principle


⊘⊘φ ⊢ φ


were provable, then Theorem 7.11 would yield the instance of Peirce’s law that
we know to be unprovable. Exercises 7.15 and 7.16 showed that it is sometimes
permissible to erase pairs of ⊘’s. We now see that such inferences are not always
accepted in the land of Peregrin.


Exercise 7.27 Show that if every instance of Peirce’s law is a Peregrinese theorem,
then: ⊘⊘P ⊢ P. (You need to find a formula of the form (((φ ▷ ψ) ▷ φ) ▷ φ)
that yields the desired result.)


Exercise 7.28 ⊢ ⊘⊘(⊘⊘P ▷ P).
Exercise 7.29 ((P ▷ Q) ▷ P) ⊢ ⊘⊘P.


Some instances of Peirce’s law are Peregrinese theorems. For example, if φ is any
Peregrinese theorem, then Int guarantees that


(((φ ▷ ψ) ▷ φ) ▷ φ)


is a Peregrinese theorem no matter what ψ is. Here is another example that you can
verify without using EFQ.


Exercise 7.30 ⊢ (((⊘⊘P ▷ ⊥) ▷ ⊘⊘P) ▷ ⊘⊘P). (You might want to use
Exercises 7.12 and 7.14 replacing ‘P’ with a sentence more useful to us now.)


7.6 Disjunctions 


Suppose the Peregrins introduce an expression ‘∨’ that behaves as follows.⁸


⁸ I am not going to try to present ‘∨’ as a natural outgrowth of Peregrinese reflections on a logic-free
conception of inference. Perhaps you can figure out a way to do so. You might start by looking into
something called “multiple-conclusion logic.”


Axiom 7.11 (OR)
Φ ∪ {(φ ∨ ψ)} ⊢ χ ⟺ Φ ∪ {φ} ⊢ χ and Φ ∪ {ψ} ⊢ χ.


Students of logic may notice that the right-left direction of Or is an inference form
known as SEPARATION OF CASES.
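Both directions of the disjunction axiom have direct counterparts in Lean 4. The sketch assumes that the new expression is modeled by Lean's ∨, that the side premises are empty, and that each sequent is represented by a function from its premise to its conclusion.

```lean
-- Assumption: the disjunction is modeled by Lean's ∨; no side premises.

-- Right-left direction (separation of cases).
example (P Q R : Prop) (hp : P → R) (hq : Q → R) : P ∨ Q → R :=
  fun h => Or.elim h hp hq

-- Left-right direction: whatever follows from the disjunction
-- follows from each disjunct separately.
example (P Q R : Prop) (h : P ∨ Q → R) : (P → R) ∧ (Q → R) :=
  ⟨fun p => h (Or.inl p), fun q => h (Or.inr q)⟩
```

The left-right direction is where the introduction principles come from: `Or.inl` and `Or.inr` are what let each disjunct yield the disjunction.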


Exercise 7.31 (P ∨ P) ⊢ P.
Exercise 7.32 P ⊢ (P ∨ Q).
Theorem 7.12 Q ⊢ (P ∨ Q).
Exercise 7.33 ⊢ ⊘⊘(P ∨ ⊘P).


Exercise 7.34 The LAW OF EXCLUDED MIDDLE (LEM) is the principle


∅ ⊢ (φ ∨ ⊘φ).


Show that some instances of LEM are not provable. (You can show that something 
is unprovable by showing that it would allow you to prove something you already 
know to be unprovable. I can report that the addition of Axiom 7.11 does not allow 
us to prove any previously unprovable instances of Peirce’s law.) 


Exercise 7.35 (P ∨ Q) ⊢ (⊘P ▷ Q).
Exercise 7.36 (⊘P ▷ Q) ⊢ ⊘⊘(P ∨ Q).


Exercise 7.37 Show that some instances of the following scheme are unprovable:


(⊘φ ▷ ψ) ⊢ (φ ∨ ψ).

Exercise 7.38 (⊘P & ⊘Q) ⊢ ⊘(P ∨ Q).
Exercise 7.39 ⊘(P ∨ Q) ⊢ (⊘P & ⊘Q).


One of the following two sequents is provable; the other is not.


⊘(⊘P ∨ ⊘Q) ⊢ (P & Q).
(P & Q) ⊢ ⊘(⊘P ∨ ⊘Q).


Exercise 7.40 Prove the provable one. 


Exercise 7.41 Show that the unprovable one is unprovable. (If your chosen sequent 
were provable, then the result of replacing each occurrence of ‘Q’ with an occur- 
rence of ‘P’ would also be provable. Show that this new sequent would let us prove 
something unprovable.) 
One of the following two sequents is provable; the other is not.


((P & ⊘Q) ▷ R) ⊢ (P ▷ (Q ∨ R)).
(P ▷ (Q ∨ R)) ⊢ ((P & ⊘Q) ▷ R).


Exercise 7.42 Prove the provable one. 


Exercise 7.43 Show that the unprovable one is unprovable. (As in Exercise 7.41, 
feel free to manipulate your chosen sequent by replacing occurrences of ‘P’, ‘Q’, 
and ‘R’ with occurrences of other formulas. Then show that your new sequent would 
let you prove something unprovable.) 


I reported (without proof) that Peirce’s law has instances not provable in our
deductive system. This allowed us to show that there are unprovable instances of double
negation elimination and LEM. These facts may also have helped you do Exercises
7.37, 7.41, and 7.43. This approach does not always work: there are unprovable
sentences that do not yield any unprovable instances of Peirce’s law. An example is


(⊘P ∨ ⊘⊘P)

which is an instance of the PRINCIPLE OF TESTABILITY

(⊘φ ∨ ⊘⊘φ).


This observation may help you with some of the following exercises.


Exercise 7.44 Show that some instances of the following scheme are unprovable:⁹


∅ ⊢ ((φ ▷ ψ) ∨ (ψ ▷ φ)).


One of the following two sequents is provable; the other is not.


(⊘P ∨ ⊘Q) ⊢ ⊘(P & Q).
⊘(P & Q) ⊢ (⊘P ∨ ⊘Q).


Exercise 7.45 Prove the provable one. 


Exercise 7.46 Show that the unprovable one is unprovable. 


⁹ Is it intrinsic to the concept of inference that, given any two sentences, one will be inferable from
the other? 
7.7 Assimilators


To the east of the Peregrins lies the land of Boole. The Boolean language is similar
to Peregrinese in many ways. For example, the Booleans have a symbol ‘⊥’ that
obeys EFQ. They also have a symbol ‘¬’ with some of the logical properties of the
Peregrinese ‘⊘’.


Proposition 7.3
Φ ⊢ ¬φ ⟺ Φ ∪ {φ} ⊢ ⊥.


Booleans will infer ¬φ from Φ if and only if they will infer anything from Φ ∪ {φ}.
This is the Boolean version of Exercise 7.17. The Booleans are an uninhibited lot
and, unlike their cautious neighbors to the west, make free use of double negation
elimination.


Proposition 7.4


{¬¬φ} ⊢ φ.


The most easterly Peregrins had frequent contact with their Boolean neighbors
and adopted many of their customs. In particular, they found it useful to incorporate
some Boolean elements into their dialect: the expression ‘¬’, for example, with all
its logical properties. They soon discovered, to their horror, that they were losing
their cultural identity.


Proposition 7.5 ⊘⊘P ⊢ P.


Proof Here is a formal derivation.

1. ¬P ⊢ ¬P              Taut
2. ¬P, P ⊢ ⊥            1 Prop 7.3
3. ¬P ⊢ ⊘P              2 Thm 7.5
4. ⊘⊘P, ⊘P ⊢ ⊥          Ex 7.20
5. ⊘⊘P, ¬P ⊢ ⊥          3,4 Cut
6. ⊘⊘P ⊢ ¬¬P            5 Prop 7.3
7. ¬¬P ⊢ P              Prop 7.4
8. ⊘⊘P ⊢ P              6,7 Cut


The Peregrins were just trying to be nice. They thought they could ease relations
with their neighbors by adopting some Boolean customs while holding tight to their
own sacred traditions. Alas, Boolean logic does not tolerate difference. It does not
coexist: it assimilates.¹⁰ (Resistance is futile.) Add Boolean negation to Peregrinese
logic and the result is a copy of Boolean logic in which ‘⊘’ behaves just like ‘¬’.
(In more conventional terminology: add classical negation to intuitionist logic and
the result is classical logic.) ‘▷’ begins to act strangely too.
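The collapse can be replayed in Lean 4, whose own logic is constructive. In the sketch below the Boolean negation is an abstract operator `neg`, and its two governing propositions are supplied as hypotheses. The names `neg`, `negElim`, `negIntro`, and `dne` are hypothetical, introduced only for this illustration, and the side premises of the first proposition are simplified to a single sentence `A`.

```lean
-- Assumptions: `neg` stands for the imported Boolean negation; Lean's ¬ stands for
-- the native Peregrinese stop sign. The three hypotheses encode the two Boolean
-- propositions (the first one split into its two directions).
example (neg : Prop → Prop)
    (negElim : ∀ {A : Prop}, neg A → A → False)
    (negIntro : ∀ {A B : Prop}, (A → B → False) → A → neg B)
    (dne : ∀ {A : Prop}, neg (neg A) → A)
    (P : Prop) : ¬¬P → P :=
  fun nnp =>
    -- From the doubled stop sign and neg P we reach absurdity, so the doubled stop
    -- sign yields neg (neg P); Boolean double negation elimination then yields P.
    dne (negIntro (fun (h : ¬¬P) (np : neg P) => h (fun p => negElim np p)) nnp)
```

Nothing classical is assumed about Lean's own ¬. Double negation elimination for the native operator is forced entirely by the hypotheses about the imported one.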


¹⁰ For more on assimilators, see Pollard [8-10]. The first two papers are available via
http://www.projecteuclid.org. You can find the third at http://www.philosophy.unimelb.edu.au/ajl/
Proposition 7.6 ⊢ (((P ▷ Q) ▷ P) ▷ P).
Proof Just apply Theorem 7.11 and Proposition 7.5.


Exercise 7.47 In the land of Peregrin, there is a monastic community whose members


’ 


may only utter iy sentences and compounds of n? sentences formed with ‘&’, ‘>’, 
“L’, °@’, and ‘vy’. The monks infer ‘L’ from ‘| = ||’. They infer — — ¢ from 
(—@ DL). They infer ‘f (a) = |’ from ‘id(f (a), |) = ||’. Without necessarily giving 
a full-blown proof, indicate why the monks will infer @ from @ @ ~ whenever ¢ is a 
ia sentence. 


To the west of the Peregrins lies the land of Peirce. The Peirceans have a symbol
‘⊃’ with some of the logical properties of ‘▷’. Here is the Peircean version of
Theorem 7.7.


Proposition 7.7
Φ ∪ {φ} ⊢ ψ ⟺ Φ ⊢ (φ ⊃ ψ).


A very un-Peregrinese trait, however, is the Peirceans’ acceptance of every instance
of Peirce’s law.


Proposition 7.8
∅ ⊢ (((φ ⊃ ψ) ⊃ φ) ⊃ φ).


Perhaps you can guess what happens when the neighborly Peregrins absorb ‘⊃’ into
their dialect.


Exercise 7.48 (P ⊃ Q), P ⊢ Q.

Proposition 7.9 (P ⊃ Q) ⊢ (P ▷ Q).

Proof Just apply Exercise 7.48 and Theorem 7.5.

Exercise 7.49 ((P ▷ Q) ▷ P) ⊢ ((P ⊃ Q) ⊃ P).

Proposition 7.10 ((P ⊃ Q) ⊃ P) ⊢ P.

Proof Just apply Propositions 7.7 and 7.8.

Exercise 7.50 ⊢ (((P ▷ Q) ▷ P) ▷ P).


Peircean logic does not tolerate difference. It does not coexist: it assimilates. Add
the Peircean conditional to Peregrinese logic and the result is a copy of Peircean
logic in which ‘▷’ behaves just like ‘⊃’. (In more conventional terminology: add the
classical conditional to intuitionist logic and the result is classical logic.) ‘⊘’ begins
to act strangely too.

Proposition 7.11 ⊘⊘P ⊢ P.
Proof Apply Exercises 7.27 and 7.50.


To the south of the Peregrins lies the land of Trivalence. The Trivalents recognize
one form of truth (T) but two forms of untruth (U₁ and U₂). They attribute T to
a sentence φ by asserting φ itself. They attribute U₁ to φ by asserting ¬φ. They
attribute U₂ to φ by asserting ¬¬φ. Since neither form of untruth is compatible
with truth, the Trivalents enjoy two versions of EFQ.


Proposition 7.12


{φ, ¬φ} ⊢ ψ.
{φ, ¬¬φ} ⊢ ψ.


Since a declarative Trivalent sentence has to be in one of the three aforementioned
states (T, U₁, U₂), exactly one of the sentences


φ, ¬φ, ¬¬φ


will be true. So anything that follows from each of these three sentences will be true.
More generally, if you are able to infer ψ whenever you add one of these sentences
to Φ, then you can infer ψ from Φ alone.


Proposition 7.13
Φ ∪ {φ} ⊢ ψ, Φ ∪ {¬φ} ⊢ ψ, Φ ∪ {¬¬φ} ⊢ ψ ⟹ Φ ⊢ ψ.


Any guesses about what happens when the most southerly Peregrins extend the hand
of good fellowship and start using ‘¬’?


Exercise 7.51 ⊘⊘P ⊢ ⊘¬P.
Exercise 7.52 ⊘⊘P ⊢ ⊘¬¬P.
Exercise 7.53 ⊘⊘P ⊢ P.

Need I say it? ‘¬’ is an assimilator.


Exercise 7.54 The Peregrins have neighbors to the north who recognize one form 
of truth and three forms of untruth. You complete the story. 


Classical (two-valued) logic does not play nice. Neither does n-valued logic for 
any finite n. 
7.8 The Glivenko/Gödel Theorems


We now bid adieu to the Peregrins and consider a formal language with the following
vocabulary.


1. Three CONNECTIVES: ‘&’, ‘▷’, and ‘∨’.
2. One SENTENTIAL CONSTANT: ‘⊥’.
3. Infinitely many SENTENCE LETTERS: ‘P’, ‘Q’, ‘P₁’, ‘Q₁’, ...
4. Two PARENTHESES: ‘(’, ‘)’.


We recursively define the SENTENCES of our language as follows.


5. ‘⊥’ is a sentence.
6. Every sentence letter is a sentence.
7. If φ and ψ are sentences, then so are ⌜(φ & ψ)⌝, ⌜(φ ▷ ψ)⌝, and ⌜(φ ∨ ψ)⌝.


We introduce ‘⊘’ via Definition 7.1. ⊢_I will be the relation governed by Axioms
7.1–7.11. (The ‘I’ stands for “intuitionist.”) ⊢_C will be the relation governed by
Axioms 7.1–7.11 and LEM. (The ‘C’ stands for “classical.”) A derivation of a
sequent Φ ⊢_I ψ or Φ ⊢_C ψ will have the same form as those in the preceding
sections: a series of numbered lines each featuring a sequent and a justification. In
a classical derivation, we accept LEM as a justification for any sequent of the form
∅ ⊢_C (φ ∨ ⊘φ). Feel free to use results we have already verified. Since every step
in an intuitionist derivation is a correct step in a classical derivation, we obtain the
following theorem.


Theorem 7.13 If there is an intuitionist derivation of Φ ⊢_I ψ, then there is a
classical derivation of Φ ⊢_C ψ.


The next three exercises will help us verify two theorems first proved by VALERII
GLIVENKO (1897–1940).¹¹


Exercise 7.55 (P ▷ ⊘Q), ⊘⊘P ⊢_I ⊘Q.
Exercise 7.56 ((P ∨ ⊘P) ▷ ⊘Q) ⊢_I ⊘Q.


Exercise 7.57 Φ ∪ {(φ ∨ ⊘φ)} ⊢_I ⊘ψ ⟹ Φ ⊢_I ⊘ψ.
Here is a close cousin of Exercise 7.57.
Exercise 7.58 Φ ∪ {(φ ∨ ⊘φ)} ⊢_I ψ ⟹ Φ ⊢_I ⊘⊘ψ.


The only distinctively classical features of one of our classical derivations will be 
lines invoking LEM. Exercise 7.57 implies that every such line is eliminable if our 
conclusion is a negation. That is, we have a technique for transforming a classical 
derivation of a negation into an intuitionist derivation. The following result will help 
us confirm this. 


¹¹ See Glivenko [2]; English translation in Mancosu [5, pp. 301-305].
Suppose we are presented with a classical derivation of the sequent Σ ⊢_C ψ. Each
application of LEM in the derivation will introduce a sequent ∅ ⊢_C (φ ∨ ⊘φ). Let the
members of Y be the sentences (φ ∨ ⊘φ) on the right-hand sides of those sequents.
We claim there is an intuitionist derivation of the sequent Σ ∪ Y′ ⊢_I ψ for some
subset Y′ of Y. Our proof is by induction on the length of the classical derivation.
That is, having confirmed that derivations of length 1 satisfy our claim, we will show
that derivations of length n + 1 will satisfy our claim as long as derivations of length n
do.

Suppose the derivation consists of just one line. If that line features an instance of
LEM, ∅ ⊢_C (φ ∨ ⊘φ), then we just note that, by Taut, there is an intuitionist derivation
of {(φ ∨ ⊘φ)} ⊢_I (φ ∨ ⊘φ). The only other possible occupants of the one line would
be instances of the intuitionist Axioms 7.1, 7.6, 7.7, 7.8, 7.9, or 7.10. (All the other
axioms involve a derivation from a previous line.)

Now suppose Σ ⊢_C ψ sits at the end of a derivation with more than one line. The
axioms we just discussed present no new problems. So we need only consider Axioms
7.2, 7.3, 7.4, 7.5, and 7.11.

Suppose the last line of the classical derivation features an application of Axiom 7.2
(Thin). Then Σ is of the form Φ ∪ {φ} and the sequent Φ ⊢_C ψ appears on an earlier
line. By our inductive hypothesis, there is an intuitionist derivation of Φ ∪ Y′ ⊢_I ψ
for some subset Y′ of Y. So, by Thin, there is an intuitionist derivation of
Φ ∪ {φ} ∪ Y′ ⊢_I ψ. That is, there is an intuitionist derivation of Σ ∪ Y′ ⊢_I ψ, as
desired.

Now suppose the last line of the classical derivation features an application of
Axiom 7.3 (Cut). Then Σ is of the form Φ ∪ Ψ and sequents Ψ ⊢_C χ and
Φ ∪ {χ} ⊢_C ψ appear on earlier lines. By our inductive hypothesis, there are
intuitionist derivations of Ψ ∪ Y′ ⊢_I χ and Φ ∪ {χ} ∪ Y″ ⊢_I ψ for some subsets Y′ and
Y″ of Y. So, by Cut, there is an intuitionist derivation of Φ ∪ Ψ ∪ Y′ ∪ Y″ ⊢_I ψ
where Y′ ∪ Y″ is a subset of Y, as desired.

Suppose the last line of the classical derivation features an application of the
left-right direction of Axiom 7.4 (And). Then Σ is of the form {(φ₁ & φ₂)} and the
sequent {φ₁, φ₂} ⊢_C ψ appears on an earlier line. By our inductive hypothesis, there
is an intuitionist derivation of {φ₁, φ₂} ∪ Y′ ⊢_I ψ for some subset Y′ of Y. So, by
Theorem 7.3, there is an intuitionist derivation of {(φ₁ & φ₂)} ∪ Y′ ⊢_I ψ. Exercise
7.2 lets us handle the right-left direction of Axiom 7.4 in a similar way. I leave it to
you to show how to handle applications of Axioms 7.5 and 7.11.


Theorem 7.14 [f there is a classical derivation of B Fe @w, then there is an 
intuitionist derivation of ® Fy Ow. 


Proof Suppose we are presented with a classical derivation of Φ ⊢_C ∼ψ. Then, as
we just showed, there is an intuitionist derivation of Φ ∪ Y ⊢_I ∼ψ where Y is a
finite set of sentences of the form (φ ∨ ∼φ). Finitely many applications of Exercise
7.57 yield an intuitionist derivation of Φ ⊢_I ∼ψ. (If you have any doubts, you can
use induction on the size of Y to confirm this last claim.)


Theorem 7.15 If there is a classical derivation of Φ ⊢_C ψ, then there is an
intuitionist derivation of Φ ⊢_I ∼∼ψ.


Proof If there is a classical derivation of Φ ⊢_C ψ, then, by Exercise 7.14, there is a
classical derivation of Φ ⊢_C ∼∼ψ. Now apply Theorem 7.14.


We will now see how Kurt Gödel improved on Glivenko’s results.¹² First some
definitions.


Definition 7.2 φ is an INTUITIONIST THEOREM if and only if there is an intuitionist
derivation of ∅ ⊢_I φ.


Definition 7.3 φ is a CLASSICAL THEOREM if and only if there is a classical
derivation of ∅ ⊢_C φ.


Definition 7.4 φ and ψ are INTUITIONISTICALLY EQUIVALENT if and only if there
are intuitionist derivations of {φ} ⊢_I ψ and {ψ} ⊢_I φ.


Definition 7.5 φ and ψ are CLASSICALLY EQUIVALENT if and only if there are
classical derivations of {φ} ⊢_C ψ and {ψ} ⊢_C φ.


Definition 7.6 A {∼, &}-SENTENCE is a sentence in which no connectives other than
‘∼’ and ‘&’ occur.


I note two facts without proof. First, neither ‘⊥’ nor any sentence letter is a classical
theorem. Second, every sentence of our formal language is classically equivalent to
a {∼, &}-sentence.


Definition 7.7 The LENGTH of a sentence is the number of occurrences of connec- 
tives within it. 


For example, ‘⊥’ and ‘P’ have length 0, while ‘∼∼(P ∨ (P → Q))’ has length
4 (with two occurrences of ‘∼’ and one occurrence each of ‘∨’ and ‘→’).


Theorem 7.16 A {∼, &}-sentence is a classical theorem only if it is an intuitionist
theorem.


Proof Our proof will be by induction on the length of our sentences. That is, having
confirmed that sentences of length 0 satisfy the theorem, we will show that sentences
of length n + 1 will satisfy the theorem as long as all sentences of length at most n do.
Since no sentence of length 0 is a classical theorem, every sentence of length 0 trivially
satisfies the theorem. A {∼, &}-sentence with non-zero length will be either a negation
or a conjunction. Theorem 7.14 takes care of the negations. In the case of a conjunction
(φ & ψ), our inductive hypothesis is that both φ and ψ satisfy the theorem. Suppose
(φ & ψ) is a classical theorem. Then φ and ψ are classical theorems and, hence, φ
and ψ are intuitionist theorems. It follows, as desired, that (φ & ψ) is an intuitionist
theorem.


We now define an INTERPRETATION FUNCTION i as follows. i(φ) = φ whenever
φ is ‘⊥’ or a sentence letter. Furthermore:


i(φ & ψ) = (i(φ) & i(ψ))
i(φ → ψ) = ∼(i(φ) & ∼i(ψ))
i(φ ∨ ψ) = ∼(∼i(φ) & ∼i(ψ)).


12 See Gödel [3]; English translation in Gödel [4, pp. 287-295].


Note that
i(∼φ) = i(φ → ⊥) = ∼(i(φ) & ∼⊥).


You can use Theorems 7.1 and 7.9 and Exercise 7.10 to derive ∼(i(φ) & ∼⊥)
from ∼i(φ) and vice versa. So i(∼φ) and ∼i(φ) are intuitionistically and classically
equivalent. It follows that you can interchange these sentences without affecting
derivability. That is, if you replace occurrences of one sentence with occurrences of
the other in an intuitionistically or classically derivable sequent, the result will be
an intuitionistically or classically derivable sequent.¹³ Since i(∼φ) and ∼i(φ) are
interchangeable, it is harmless to treat them as identical:


i(∼φ) = ∼i(φ).


The function i transforms every sentence into a classically equivalent
{∼, &}-sentence. For example,


i(∼∼P → P) = ∼(∼∼P & ∼P).
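The clauses defining i can be sketched in code. The tuple encoding and the names `neg`, `i`, `value`, and `same_truth_table` below are assumptions made for this illustration. The truth-table comparison relies on the standard fact, not proved in this chapter, that classical equivalence coincides with having the same truth table; it is a semantic check, not a derivation in our sequent system.

```python
from itertools import product

BOTTOM = "⊥"

def neg(s):
    return ("not", s)

def i(s):
    """Interpretation function, treating i(∼φ) as ∼i(φ)."""
    if isinstance(s, str):  # "⊥" or a sentence letter
        return s
    op = s[0]
    if op == "not":
        return neg(i(s[1]))
    a, b = i(s[1]), i(s[2])
    if op == "and":
        return ("and", a, b)
    if op == "imp":
        return neg(("and", a, neg(b)))
    if op == "or":
        return neg(("and", neg(a), neg(b)))
    raise ValueError(f"unknown connective: {op}")

def letters(s):
    """Collect the sentence letters occurring in s."""
    if isinstance(s, str):
        return set() if s == BOTTOM else {s}
    return set().union(*(letters(part) for part in s[1:]))

def value(s, v):
    """Classical truth value of s under the assignment v."""
    if isinstance(s, str):
        return False if s == BOTTOM else v[s]
    if s[0] == "not":
        return not value(s[1], v)
    x, y = value(s[1], v), value(s[2], v)
    return {"and": x and y, "or": x or y, "imp": (not x) or y}[s[0]]

def same_truth_table(s, t):
    names = sorted(letters(s) | letters(t))
    return all(
        value(s, dict(zip(names, row))) == value(t, dict(zip(names, row)))
        for row in product([False, True], repeat=len(names))
    )

example = ("imp", neg(neg("P")), "P")               # ∼∼P → P
expected = neg(("and", neg(neg("P")), neg("P")))    # ∼(∼∼P & ∼P)
print(i(example) == expected)                       # prints: True
print(same_truth_table(example, i(example)))        # prints: True
```

The output of `i` contains only negation and conjunction, so it is always a {∼, &}-sentence in the sense of Definition 7.6.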


This lets us show that i assigns an intuitionist theorem to every classical theorem.


Theorem 7.17 If ψ is a classical theorem, then i(ψ) is an intuitionist theorem.


Proof If ψ is a classical theorem, then so is i(ψ) (since ψ and i(ψ) are classically
equivalent). But then, by Theorem 7.16, i(ψ) is an intuitionist theorem.


No classical derivation of a theorem is pointless from an intuitionist perspective.
Every such derivation yields an intuitionist theorem. If the classical theorem
is a {∼, &}-sentence, then, by Theorem 7.16, that sentence is itself an intuitionist
theorem. In every other case, we can apply Theorem 7.15 or 7.17.


7.9 First-Order Logic With a Decidable Relation 


We will now confirm that a similar result applies to a much more expressive formal 
language. Our new language features the following vocabulary. 


1. Three CONNECTIVES: ‘&’, ‘→’, and ‘∨’.
2. One QUANTIFIER: ‘∀’.
3. One RELATION symbol: ‘R’.
4. One SENTENTIAL CONSTANT: ‘⊥’.
5. Infinitely many VARIABLES: ‘w’, ‘x’, ‘y’, ‘z’, ‘w₁’, ‘x₁’, ‘y₁’, ‘z₁’, ...
6. Two PARENTHESES: ‘(’, ‘)’.


13 For a proof, see Martin [6, p. 168].


We recursively define the FORMULAS of our language as follows. 


7. ‘⊥’ is a formula.
8. If α and β are variables, then ⌜Rαβ⌝ is a formula.
9. If φ and ψ are formulas, then so are ⌜(φ & ψ)⌝, ⌜(φ → ψ)⌝, and ⌜(φ ∨ ψ)⌝.
10. If α is a variable and φ is a formula, then ⌜∀α φ⌝ is a formula.


We introduce ‘∼’ via Definition 7.1. We define FREE and BOUND occurrences of
variables as in Chap. 2.


Axiom 7.12 If φ(β) is the result of replacing every free occurrence of the variable
α in φ(α) with a free occurrence of the variable β, then


{∀α φ(α)} ⊢ φ(β).


If α = β, this takes the form
{∀α φ} ⊢ φ.


Axiom 7.13 If the variable α does not occur free in any member of Φ, then
Φ ⊢ φ ⟹ Φ ⊢ ∀α φ.


Axiom 7.14
∅ ⊢ ∀x∀y(Rxy ∨ ∼Rxy).


Axiom 7.14 says that the relation R satisfies a version of LEM. Such a relation
is said to be DECIDABLE. Now let ⊢_I be the relation governed by Axioms 7.1-7.14.
Let ⊢_C be the relation governed by Axioms 7.1-7.13 and LEM. Since Axiom 7.14
is a classical theorem, we once again have:


Φ ⊢_I ψ ⟹ Φ ⊢_C ψ.


We extend our interpretation function i to our new language by stipulating:


i(Rαβ) = Rαβ
i(∀α φ) = ∀α i(φ).


When we assess the length of a formula we will now count occurrences of the
quantifier ‘∀’ as well as occurrences of connectives. For example, ‘∀x ∼∀y ∼Rxy’
has length 4.
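The extended length measure and the two new clauses for i can be sketched over a tuple encoding of formulas. The encoding and all names below are assumptions made for illustration: ("R", a, b) stands for an atomic formula and ("forall", a, f) for a universal generalization.

```python
# Hypothetical encoding: "⊥" is a string; ("R", a, b) is atomic;
# ("not", f), ("and", f, g), ("or", f, g), ("imp", f, g), ("forall", a, f).

def length(f):
    """Count occurrences of connectives and of the quantifier."""
    if f == "⊥" or f[0] == "R":
        return 0
    if f[0] == "not":
        return 1 + length(f[1])
    if f[0] == "forall":
        return 1 + length(f[2])
    return 1 + length(f[1]) + length(f[2])

def i(f):
    """Interpretation function extended to atomic and quantified formulas."""
    if f == "⊥" or f[0] == "R":
        return f
    if f[0] == "not":
        return ("not", i(f[1]))
    if f[0] == "forall":
        return ("forall", f[1], i(f[2]))
    a, b = i(f[1]), i(f[2])
    if f[0] == "and":
        return ("and", a, b)
    if f[0] == "imp":
        return ("not", ("and", a, ("not", b)))
    if f[0] == "or":
        return ("not", ("and", ("not", a), ("not", b)))
    raise ValueError(f"unknown operator: {f[0]}")

rxy = ("R", "x", "y")
example = ("forall", "x", ("not", ("forall", "y", ("not", rxy))))
print(length(example))  # prints: 4

# Axiom 7.14 and its image under i.
ax714 = ("forall", "x", ("forall", "y", ("or", rxy, ("not", rxy))))
image = ("forall", "x", ("forall", "y",
         ("not", ("and", ("not", rxy), ("not", ("not", rxy))))))
print(i(ax714) == image)  # prints: True
```

Because i leaves variables and quantifier prefixes untouched, a variable is free in i(φ) exactly when it is free in φ.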


Exercise 7.59 ∅ ⊢_I ∀x∀y(∼∼Rxy → Rxy).
Exercise 7.60 {∼∼(φ & ψ)} ⊢_I (∼∼φ & ∼∼ψ).
Exercise 7.61 {φ} ⊢_I ψ ⟹ {∀α φ} ⊢_I ∀α ψ.
Exercise 7.62 {∼∼∀α φ} ⊢_I ∀α ∼∼φ.


Exercise 7.63 Use induction on the length of the formula i(φ) to show:


{∼∼i(φ)} ⊢_I i(φ).
You might find Exercise 7.15 useful.


Exercise 7.64 {(i(φ) → i(ψ))} ⊢_I i(φ → ψ).


Exercise 7.65 {i(φ → ψ)} ⊢_I (i(φ) → i(ψ)). (You might take a look at Exercises
7.19 and 7.24.)


Exercise 7.66 {i(φ)} ⊢_I i(φ ∨ ψ) and {i(ψ)} ⊢_I i(φ ∨ ψ).
Exercise 7.67


Φ ∪ {i(φ)} ⊢_I i(χ) and Φ ∪ {i(ψ)} ⊢_I i(χ) ⟹ Φ ∪ {i(φ ∨ ψ)} ⊢_I i(χ).


Exercise 7.68 Show that if
φ₁, ..., φₙ ⊢_C ψ


is an instance of a classical axiom, then there is an intuitionist derivation of


i(φ₁), ..., i(φₙ) ⊢_I i(ψ).


You will need to check all the classical axioms that allow you to write down a sequent
without any help from earlier sequents. They are: 7.1, 7.6, 7.7, 7.8, 7.9, 7.10, 7.12,
and LEM. You will probably find Exercises 7.64 and 7.65 useful.


Exercise 7.69 Suppose we have a classical derivation of {φ₁, ..., φₙ} ⊢_C ψ. Use
induction on the length of that derivation to show that there is an intuitionist
derivation of {i(φ₁), ..., i(φₙ)} ⊢_I i(ψ).


If we add more relation symbols to our language, then, as long as the corresponding 
versions of Axiom 7.14 are intuitionist theorems, the preceding results will continue 
to hold. We might, for example, add symbols meant to express the relations “x is the 
successor of y,” “x is the sum of y and z,” and “x is the product of y and z.” All three 
relations are primitive recursive and intuitionists agree that all three are decidable. 
Let HA be the result of retaining all the axioms of PA but switching the underlying
logic from classical to intuitionist.¹⁴ Then the result of applying an interpretation
function i with the above properties to an axiom of PA is always a theorem of HA.
So, as Gödel was quick to observe, PA is consistent if HA is. For suppose PA is
inconsistent. Then, for some PA axioms φ₁, ..., φₙ and any PA formula ψ,


{φ₁, ..., φₙ} ⊢_C (ψ & ∼ψ).


Applying a version of Exercise 7.69, we get:


{i(φ₁), ..., i(φₙ)} ⊢_I i(ψ & ∼ψ).


But
i(ψ & ∼ψ) = (i(ψ) & ∼i(ψ)).


So HA is inconsistent if PA is.¹⁵ If you accept the consistency of intuitionist
arithmetic, you are in no position to question the consistency of PA.


14 The ‘H’ in HA commemorates the mathematician AREND HEYTING (1898-1980).


7.10 Curry’s Paradox 


We conclude this chapter, and the book, with another example of how perilous set 
theory can be. Suppose a Peregrinese mathematician proposes the following. 


Proposition 7.14 (SET)
β ∈ {α : φ(α)} ⊣⊢ φ(β).


The idea is that β will belong to the set of those things that have property φ if and
only if β has property φ. 3 will belong to the set of prime numbers if and only if 3
is a prime number. Our experience with lists back in Chap. 3 should make us wary
here: recall the non-existent list of all lists that do not list themselves. Let us consider
the Peregrinese set of all sets that are not members of themselves: {x : x ∉ x} or,
in unabbreviated notation, {x : (x ∈ x → ⊥)}. For brevity’s sake, let us call this set
‘c’. Note that c is a member of itself if and only if (c ∈ c → ⊥).


Proposition 7.15 ⊢ ⊥.


Proof Here is a formal derivation.


1. c ∈ c ⊢ (c ∈ c → ⊥)            Set
2. c ∈ c, (c ∈ c → ⊥) ⊢ ⊥         MP
3. c ∈ c ⊢ ⊥                      1, 2 Cut
4. ⊢ (c ∈ c → ⊥)                  3 Ded
5. (c ∈ c → ⊥) ⊢ c ∈ c            Set
6. ⊢ c ∈ c                        4, 5 Cut
7. ⊢ ⊥                            3, 6 Cut


15 Again, see Gödel [3]; English translation in Gödel [4, pp. 287-295].


Note that we did not use Int. Skepticism about that rule will not keep you out
of trouble here. Even more interesting: we did not use EFQ. Since we used no
special information about ‘⊥’, the proof will still work when we replace ‘⊥’ with
any sentence. So, given only Cut, Ded, and MP as our logical apparatus, the rule Set
allows us to show that every sentence is a theorem.¹⁶
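A classical truth-table check gives a rough semantic analogue of this point. It is only an illustration under stated assumptions: the derivation above needs no truth tables and uses nothing beyond Cut, Ded, MP, and Set. Here `a` stands for ‘c ∈ c’ and `s` for whatever sentence replaces ‘⊥’; both names, and the helper `implies`, are hypothetical.

```python
from itertools import product

def implies(a, b):
    """Classical truth function for the conditional."""
    return (not a) or b

# Set makes 'c ∈ c' and '(c ∈ c → S)' interderivable, so classically the two
# must share a truth value. Keep only the rows where they do.
surviving_rows = [
    (a, s)
    for a, s in product([False, True], repeat=2)
    if a == implies(a, s)
]
print(surviving_rows)  # prints: [(True, True)]
```

In the only surviving row `s` is true, whatever sentence it stands for, which mirrors the conclusion that Set lets us prove every sentence.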


7.11 Solutions of Odd-Numbered Exercises 


7.1
1. {(φ₁ & φ₂)} ⊢ (φ₁ & φ₂)        Taut
2. {φ₁, φ₂} ⊢ (φ₁ & φ₂)           1 And

7.3
1. S₁, P ⊢ Q                      Prem
2. S₂, P ⊢ R                      Prem
3. Q, R ⊢ (Q & R)                 Ex 7.1
4. S₁, P, R ⊢ (Q & R)             1, 3 Cut
5. S₁, S₂, P ⊢ (Q & R)            2, 4 Cut

7.5
1. (P & (Q & R)) ⊢ P              Thm 7.1
2. (P & (Q & R)) ⊢ (Q & R)        Thm 7.2
3. (Q & R) ⊢ Q                    Thm 7.1
4. (Q & R) ⊢ R                    Thm 7.2
5. (P & (Q & R)) ⊢ Q              2, 3 Cut
6. (P & (Q & R)) ⊢ R              2, 4 Cut
7. (P & (Q & R)) ⊢ (P & Q)        1, 5 Ex 7.3
8. (P & (Q & R)) ⊢ ((P & Q) & R)  6, 7 Ex 7.3

7.7
1. P₁, ..., Pₙ ⊢ (Q → R)          Prem
2. Q, (Q → R) ⊢ R                 MP
3. P₁, ..., Pₙ, Q ⊢ R             1, 2 Cut


16 The American logician HASKELL CURRY (1900-1982) noted this in [1]
(http://www.jstor.org/stable/2269292).

7.9
1. P, (P → Q) ⊢ Q                         MP
2. Q, (Q → ⊥) ⊢ ⊥                         MP
3. P, (P → Q), (Q → ⊥) ⊢ ⊥                1, 2 Cut
4. (P → Q), (Q → ⊥) ⊢ (P → ⊥)             3 Thm 7.6
5. (P → Q) ⊢ ((Q → ⊥) → (P → ⊥))          4 Thm 7.5

7.11
1. P, (P → (Q → ⊥)) ⊢ (Q → ⊥)             MP
2. Q, (Q → ⊥) ⊢ ⊥                         MP
3. P, Q, (P → (Q → ⊥)) ⊢ ⊥                1, 2 Cut
4. Q, (P → (Q → ⊥)) ⊢ (P → ⊥)             3 Thm 7.6
5. (P → (Q → ⊥)) ⊢ (Q → (P → ⊥))          4 Thm 7.5
7.13
1. ⊥ ⊢ (P → ⊥)                    Int

7.15
1. P ⊢ ∼∼P                        Ex 7.14
2. ∼∼∼P ⊢ ∼P                      1 Ex 7.10


7.17 Since ‘∼Q’ is shorthand for ‘(Q → ⊥)’, this exercise is just an instance of
Theorem 7.7.


7.19
1. P, (P → Q) ⊢ Q                 MP
2. P, ∼Q ⊢ ∼(P → Q)               1 Ex 7.10
3. P, ∼∼(P → Q) ⊢ ∼∼Q             2 Ex 7.10
4. ∼∼(P → Q) ⊢ (P → ∼∼Q)          3 Thm 7.5

7.21
1. P, ∼P ⊢ Q                      Ex 7.20
2. ∼P ⊢ (P → Q)                   1 Thm 7.5
3. ∼(P → Q) ⊢ ∼∼P                 2 Ex 7.10

7.23
1. (P → ∼∼Q) ⊢ (∼Q → ∼P)          Ex 7.11
2. (∼Q → ∼P) ⊢ ∼∼(P → Q)          Ex 7.22
3. (P → ∼∼Q) ⊢ ∼∼(P → Q)          1, 2 Cut
7.25 
1. ne) _| Prem 
2, - @@ L 1Gen 
3.@@LEL Ex 7.16 
4. a 2, 3 Cut 


7.27 Note that ‘((∼P → P) → P)’ is an instance of Peirce’s law because it is
shorthand for ‘(((P → ⊥) → P) → P)’.


1. ∅ ⊢ ((∼P → P) → P)                     Prem
2. ∼P, ∼∼P ⊢ P                            Ex 7.20
3. ∼∼P ⊢ (∼P → P)                         2 Thm 7.5
4. (∼P → P), ((∼P → P) → P) ⊢ P           MP
5. (∼P → P) ⊢ P                           1, 4 Cut
6. ∼∼P ⊢ P                                3, 5 Cut

7.29
1. ∼(P → Q) ⊢ ∼∼P                         Ex 7.21
2. (P → Q), ((P → Q) → P) ⊢ P             MP
3. P ⊢ ∼∼P                                Ex 7.14
4. (P → Q), ((P → Q) → P) ⊢ ∼∼P           2, 3 Cut
5. ((P → Q) → P) ⊢ ∼∼P                    1, 4 Thm 7.10
7.31
1. P ⊢ P                          Taut
2. (P ∨ P) ⊢ P                    1 Or

7.33
1. P ⊢ (P ∨ ∼P)                   Ex 7.32
2. ∼P ⊢ (P ∨ ∼P)                  Thm 7.12
3. ∅ ⊢ ∼∼(P ∨ ∼P)                 1, 2 Thm 7.10

7.35
1. P, ∼P ⊢ Q                      Ex 7.20
2. P ⊢ (∼P → Q)                   1 Thm 7.5
3. Q ⊢ (∼P → Q)                   Int
4. (P ∨ Q) ⊢ (∼P → Q)             2, 3 Or

7.37
1. ∅ ⊢ (∼P → ∼P)                  Ex 7.6
2. (∼P → ∼P) ⊢ (P ∨ ∼P)           Prem
3. ∅ ⊢ (P ∨ ∼P)                   1, 2 Cut

7.39
1. P ⊢ (P ∨ Q)                    Ex 7.32
2. Q ⊢ (P ∨ Q)                    Thm 7.12
3. ∼(P ∨ Q) ⊢ ∼P                  1 Ex 7.10
4. ∼(P ∨ Q) ⊢ ∼Q                  2 Ex 7.10
5. ∼(P ∨ Q) ⊢ (∼P & ∼Q)           3, 4 Ex 7.3
7.41 By Theorem 7.11, we can prove every instance of Peirce’s law if we can prove:
∼∼P ⊢ P. Now note the following.


1. (∼∼P & ∼∼P) ⊢ ∼(∼P ∨ ∼P)       Ex 7.38
2. ∼∼P ⊢ (∼∼P & ∼∼P)              Ex 7.1
3. ∼∼P ⊢ ∼(∼P ∨ ∼P)               1, 2 Cut
4. ∼(∼P ∨ ∼P) ⊢ (P & P)           Prem
5. ∼∼P ⊢ (P & P)                  3, 4 Cut
6. (P & P) ⊢ P                    Thm 7.1
7. ∼∼P ⊢ P                        5, 6 Cut

7.43
1. (((P → P) & ∼P) → ∼P) ⊢ ((P → P) → (P ∨ ∼P))      Prem
2. ∼P ⊢ ∼P                                           Taut
3. (P → P), ∼P ⊢ ∼P                                  2 Thin
4. ((P → P) & ∼P) ⊢ ∼P                               3 And
5. ∅ ⊢ (((P → P) & ∼P) → ∼P)                         4 Ded
6. ∅ ⊢ ((P → P) → (P ∨ ∼P))                          1, 5 Cut
7. ((P → P) → (P ∨ ∼P)), (P → P) ⊢ (P ∨ ∼P)          MP
8. (P → P) ⊢ (P ∨ ∼P)                                6, 7 Cut
9. ∅ ⊢ (P → P)                                       Ex 7.6
10. ∅ ⊢ (P ∨ ∼P)                                     8, 9 Cut

7.45
1. (P & Q) ⊢ P                    Thm 7.1
2. (P & Q) ⊢ Q                    Thm 7.2
3. ∼P ⊢ ∼(P & Q)                  1 Ex 7.10
4. ∼Q ⊢ ∼(P & Q)                  2 Ex 7.10
5. (∼P ∨ ∼Q) ⊢ ∼(P & Q)           3, 4 Or


7.47 The point is that the negation we defined in Chap. 1 has classical properties
that will be inherited by ‘∼’. To work this out in detail, we might reproduce the
argument we gave in support of Proposition 7.5. We begin with the odd assumption
that f(a) = | and f(a) ≠ |. We just want to see what the monks will infer from
this. By Exercise 1.16 and Definition 1.8, id(f(a), |) = || and id(f(a), |) = | and,
hence, | = ||. From this, the monks infer ⊥. So the following is a monastically correct
sequent:


f(a) = |, f(a) ≠ | ⊢ ⊥.


But, by Exercise 7.17, this yields


f(a) ≠ | ⊢ ∼(f(a) = |)


and, hence, as in the proof of Proposition 7.5


∼∼(f(a) = |), f(a) ≠ | ⊢ ⊥.


So, by Theorem 7.5,


∼∼(f(a) = |) ⊢ (f(a) ≠ | → ⊥)


and, hence,


∼∼(f(a) = |) ⊢ ¬¬(f(a) = |).


Now suppose ¬¬(f(a) = |). That is, id(id(f(a), |), |) = |. By Exercise 1.18, this
implies that
id(f(a), |) + | = |||


and, hence, id(f(a), |) = ||. From this, the monks infer that f(a) = |. So the
following is a monastically correct sequent


¬¬(f(a) = |) ⊢ f(a) = |


and, hence, so is


∼∼(f(a) = |) ⊢ f(a) = |.




7.49 
1. (PD Q)F (PDQ) Prop 7.9 
2.((PbQ)>P),(P>Q)FQ MP 
3.((P>O)PP),(PDO)FQ 1,2 Cut 
4. ((P & Q) > P)- ((P D Q) D Q) 3 Prop 7.7 
7.51 
1. .Pe= PL Prop 7.12 
2. —+PEOP 1 Ex 7.17 
3. O@P+LO@— P 2Ex7.10 
7.53 
1. Q@@PFO-—P Ex 7.51 
2. Q@@P+O——>P Ex 7.52 
3. @OP,7PEL 1 Ex 7.17 
4.@@P,77>7 PEL 2 Ex 7.17 
5. LEP EFQ 
6 @@P,>PHP 3,5 Cut 
7.@@P,77 PEP 4,5 Cut 
8. PEP Taut 
9. @OP,PtP 8 Thin 
10. @@OPKP 6, 7, 9 Prop 7.13 
7.55
1. (P → ∼Q) ⊢_I (Q → ∼P)                  Ex 7.11
2. (P → ∼Q), Q ⊢_I ∼P                     1 Ex 7.7
3. ∼P, ∼∼P ⊢_I ⊥                          Ex 7.20
4. (P → ∼Q), ∼∼P, Q ⊢_I ⊥                 2, 3 Cut
5. (P → ∼Q), ∼∼P ⊢_I ∼Q                   4 Ex 7.17

7.57
1. Φ ∪ {(φ ∨ ∼φ)} ⊢_I ∼ψ                  Prem
2. Φ ⊢_I ((φ ∨ ∼φ) → ∼ψ)                  1 Thm 7.6
3. {((φ ∨ ∼φ) → ∼ψ)} ⊢_I ∼ψ               Ex 7.56
4. Φ ⊢_I ∼ψ                               2, 3 Cut

7.59
1. ∅ ⊢_I ∀x∀y(Rxy ∨ ∼Rxy)                         Ax 7.14
2. ∀x∀y(Rxy ∨ ∼Rxy) ⊢_I ∀y(Rxy ∨ ∼Rxy)            Ax 7.12
3. ∅ ⊢_I ∀y(Rxy ∨ ∼Rxy)                           1, 2 Cut
4. ∀y(Rxy ∨ ∼Rxy) ⊢_I (Rxy ∨ ∼Rxy)                Ax 7.12
5. ∅ ⊢_I (Rxy ∨ ∼Rxy)                             3, 4 Cut
6. Rxy ⊢_I (∼∼Rxy → Rxy)                          Int
7. ∼Rxy, ∼∼Rxy ⊢_I Rxy                            Ex 7.20
8. ∼Rxy ⊢_I (∼∼Rxy → Rxy)                         7 Thm 7.5
9. (Rxy ∨ ∼Rxy) ⊢_I (∼∼Rxy → Rxy)                 6, 8 Or
10. ∅ ⊢_I (∼∼Rxy → Rxy)                           5, 9 Cut
11. ∅ ⊢_I ∀y(∼∼Rxy → Rxy)                         10 Ax 7.13
12. ∅ ⊢_I ∀x∀y(∼∼Rxy → Rxy)                       11 Ax 7.13
7.61
1. {φ} ⊢_I ψ                      Prem
2. {∀α φ} ⊢_I φ                   Ax 7.12
3. {∀α φ} ⊢_I ψ                   1, 2 Cut
4. {∀α φ} ⊢_I ∀α ψ                3 Ax 7.13


7.63 If φ has length 0, then it is of the form Rαβ and we have to show:
{∼∼Rαβ} ⊢_I Rαβ.
This follows from Exercise 7.59. If φ is a negation ∼ψ, we have to show:
{∼∼∼i(ψ)} ⊢_I ∼i(ψ).
This follows from Exercise 7.15. If φ is a conjunction (ψ₁ & ψ₂), we have to show:


{∼∼(i(ψ₁) & i(ψ₂))} ⊢_I (i(ψ₁) & i(ψ₂)).


We reason as follows. (The premises on lines 4 and 5 are our inductive hypotheses.)


1. {∼∼(i(ψ₁) & i(ψ₂))} ⊢_I (∼∼i(ψ₁) & ∼∼i(ψ₂))     Ex 7.60
2. {(∼∼i(ψ₁) & ∼∼i(ψ₂))} ⊢_I ∼∼i(ψ₁)               Thm 7.1
3. {(∼∼i(ψ₁) & ∼∼i(ψ₂))} ⊢_I ∼∼i(ψ₂)               Thm 7.2
4. {∼∼i(ψ₁)} ⊢_I i(ψ₁)                             Prem
5. {∼∼i(ψ₂)} ⊢_I i(ψ₂)                             Prem
6. {(∼∼i(ψ₁) & ∼∼i(ψ₂))} ⊢_I i(ψ₁)                 2, 4 Cut
7. {(∼∼i(ψ₁) & ∼∼i(ψ₂))} ⊢_I i(ψ₂)                 3, 5 Cut
8. {(∼∼i(ψ₁) & ∼∼i(ψ₂))} ⊢_I (i(ψ₁) & i(ψ₂))       6, 7 Ex 7.3
9. {∼∼(i(ψ₁) & i(ψ₂))} ⊢_I (i(ψ₁) & i(ψ₂))         1, 8 Cut


If φ is a conditional (ψ₁ → ψ₂), we have to show:


{∼∼∼(i(ψ₁) & ∼i(ψ₂))} ⊢_I ∼(i(ψ₁) & ∼i(ψ₂)).


This follows from Exercise 7.15. If φ is a disjunction (ψ₁ ∨ ψ₂), we have to show:


{∼∼∼(∼i(ψ₁) & ∼i(ψ₂))} ⊢_I ∼(∼i(ψ₁) & ∼i(ψ₂)).


This follows from Exercise 7.15. If φ is a universal generalization ∀α ψ, we have to
show:


{∼∼∀α i(ψ)} ⊢_I ∀α i(ψ).


We reason as follows. (The premise on line 3 is our inductive hypothesis.)


1. {∼∼∀α i(ψ)} ⊢_I ∀α ∼∼i(ψ)      Ex 7.62
2. {∀α ∼∼i(ψ)} ⊢_I ∼∼i(ψ)         Ax 7.12
3. {∼∼i(ψ)} ⊢_I i(ψ)              Prem
4. {∼∼∀α i(ψ)} ⊢_I ∼∼i(ψ)         1, 2 Cut
5. {∼∼∀α i(ψ)} ⊢_I i(ψ)           3, 4 Cut
6. {∼∼∀α i(ψ)} ⊢_I ∀α i(ψ)        5 Ax 7.13


7.65
1. {∼(i(φ) & ∼i(ψ))} ⊢_I ∼∼(i(φ) → i(ψ))           Ex 7.24
2. {∼∼(i(φ) → i(ψ))} ⊢_I (i(φ) → ∼∼i(ψ))           Ex 7.19
3. {∼(i(φ) & ∼i(ψ))} ⊢_I (i(φ) → ∼∼i(ψ))           1, 2 Cut
4. {∼(i(φ) & ∼i(ψ)), i(φ)} ⊢_I ∼∼i(ψ)              3 Ex 7.7
5. {∼∼i(ψ)} ⊢_I i(ψ)                               Ex 7.63
6. {∼(i(φ) & ∼i(ψ)), i(φ)} ⊢_I i(ψ)                4, 5 Cut
7. {∼(i(φ) & ∼i(ψ))} ⊢_I (i(φ) → i(ψ))             6 Thm 7.5

7.67
1. Φ ∪ {i(φ)} ⊢_I i(χ)                             Prem
2. Φ ∪ {i(ψ)} ⊢_I i(χ)                             Prem
3. Φ ∪ {∼i(χ)} ⊢_I ∼i(φ)                           1 Ex 7.10
4. Φ ∪ {∼i(χ)} ⊢_I ∼i(ψ)                           2 Ex 7.10
5. Φ ∪ {∼i(χ)} ⊢_I (∼i(φ) & ∼i(ψ))                 3, 4 Ex 7.3
6. Φ ∪ {∼(∼i(φ) & ∼i(ψ))} ⊢_I ∼∼i(χ)               5 Ex 7.10
7. {∼∼i(χ)} ⊢_I i(χ)                               Ex 7.63
8. Φ ∪ {∼(∼i(φ) & ∼i(ψ))} ⊢_I i(χ)                 6, 7 Cut


7.69 Suppose we have a classical derivation of Σ ⊢_C ψ. If the derivation is of length
1, we just apply Exercise 7.68. If the derivation is longer, Exercise 7.68 lets us focus
on Axioms 7.2, 7.3, 7.4, 7.5, 7.11, and 7.13. Suppose the last line of the classical
derivation features an application of Axiom 7.2 (Thin). Then Σ is of the form Φ ∪ {φ}
and the sequent Φ ⊢_C ψ appears on an earlier line. Let i[Φ] be the result of applying
the interpretation function i to each member of Φ. By our inductive hypothesis,
there is an intuitionist derivation of i[Φ] ⊢_I i(ψ). An application of Thin yields
i[Φ] ∪ {i(φ)} ⊢_I i(ψ), as desired. Now suppose the last line of the classical derivation
features an application of Axiom 7.3 (Cut). Then Σ is of the form Φ ∪ Ψ and sequents
Ψ ⊢_C χ and Φ ∪ {χ} ⊢_C ψ appear on earlier lines. By our inductive hypothesis,
there are intuitionist derivations of i[Ψ] ⊢_I i(χ) and i[Φ] ∪ {i(χ)} ⊢_I i(ψ). So, by
Cut, there is an intuitionist derivation of i[Φ ∪ Ψ] ⊢_I i(ψ), as desired. Suppose the
last line of the classical derivation features an application of the left-right direction
of Axiom 7.4 (And). Then Σ is of the form {(φ₁ & φ₂)} and the sequent {φ₁, φ₂} ⊢_C
ψ appears on an earlier line. By our inductive hypothesis, there is an intuitionist
derivation of {i(φ₁), i(φ₂)} ⊢_I i(ψ). So, by And, there is an intuitionist derivation of
{(i(φ₁) & i(φ₂))} ⊢_I i(ψ). By the definition of the function i, {i(φ₁ & φ₂)} ⊢_I i(ψ),
as desired. A similar argument applies to the right-left direction of Axiom 7.4.
Suppose the last line of the classical derivation features an application of Axiom
7.5 (Ded) yielding the sequent ∅ ⊢_C (φ → ψ). Then the sequent {φ} ⊢_C ψ appears
on an earlier line. By our inductive hypothesis, there is an intuitionist derivation of
{i(φ)} ⊢_I i(ψ). So, by Ded, there is an intuitionist derivation of ∅ ⊢_I (i(φ) → i(ψ)).
Exercise 7.64 lets us derive ∅ ⊢_I i(φ → ψ), as desired. Suppose the last line of the
classical derivation features an application of the left-right direction of Axiom 7.11
(Or) yielding the sequent Φ ∪ {φ} ⊢_C χ. Then the sequent Φ ∪ {φ ∨ ψ} ⊢_C χ appears
on an earlier line. By our inductive hypothesis, there is an intuitionist derivation of
i[Φ] ∪ {i(φ ∨ ψ)} ⊢_I i(χ). Exercise 7.66 lets us derive i[Φ] ∪ {i(φ)} ⊢_I i(χ), as
desired. Suppose the last line of the classical derivation features an application of
the right-left direction of Axiom 7.11 (Or) yielding the sequent Φ ∪ {φ ∨ ψ} ⊢_C χ.
Then the sequents Φ ∪ {φ} ⊢_C χ and Φ ∪ {ψ} ⊢_C χ appear on earlier lines. By our
inductive hypothesis, there are intuitionist derivations of i[Φ] ∪ {i(φ)} ⊢_I i(χ) and
i[Φ] ∪ {i(ψ)} ⊢_I i(χ). So, by Exercise 7.67, there is an intuitionist derivation of
i[Φ] ∪ {i(φ ∨ ψ)} ⊢_I i(χ), as desired. Suppose the last line of the classical derivation
features an application of Axiom 7.13 yielding the sequent Φ ⊢_C ∀α φ with α not
occurring free in any member of Φ. Then the sequent Φ ⊢_C φ appears on an earlier
line. By our inductive hypothesis, there is an intuitionist derivation of i[Φ] ⊢_I i(φ).
Inspection of the definition of the function i shows that the variable α does not occur
free in any member of i[Φ]. So, by Axiom 7.13, there is an intuitionist derivation of
i[Φ] ⊢_I ∀α i(φ). The definition of i assures us that i[Φ] ⊢_I i(∀α φ), as desired.